[jira] [Created] (HIVE-2236) Cli: Print Hadoop's CPU milliseconds
Cli: Print Hadoop's CPU milliseconds

                 Key: HIVE-2236
                 URL: https://issues.apache.org/jira/browse/HIVE-2236
             Project: Hive
          Issue Type: New Feature
          Components: CLI
            Reporter: Siying Dong
            Priority: Minor

CPU milliseconds information is available from Hadoop's framework. Printing it out to the Hive CLI when executing a job will help users learn more about their jobs.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
Review Request: Cli: Print Hadoop's CPU milliseconds
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/948/
---

Review request for hive, Yongqiang He, Ning Zhang, and namit jain.

Summary
---
In the Hive CLI, print out CPU msec from the Hadoop MapReduce counters.

This addresses bug HIVE-2236.
    https://issues.apache.org/jira/browse/HIVE-2236

Diffs
-
  trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1138748
  trunk/ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java 1138748
  trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java PRE-CREATION
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1138748
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1138748
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1138748
  trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 1138748

Diff: https://reviews.apache.org/r/948/diff

Testing
---
Ran the updated code against real clusters and verified that the printed output is correct.

Thanks,
Siying
[jira] [Commented] (HIVE-2236) Cli: Print Hadoop's CPU milliseconds
[ https://issues.apache.org/jira/browse/HIVE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053684#comment-13053684 ]

jirapos...@reviews.apache.org commented on HIVE-2236:
-

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/948/
---

Review request for hive, Yongqiang He, Ning Zhang, and namit jain.

Summary
---
In the Hive CLI, print out CPU msec from the Hadoop MapReduce counters.

This addresses bug HIVE-2236.
    https://issues.apache.org/jira/browse/HIVE-2236

Diffs
-
  trunk/ql/src/java/org/apache/hadoop/hive/ql/Driver.java 1138748
  trunk/ql/src/java/org/apache/hadoop/hive/ql/DriverContext.java 1138748
  trunk/ql/src/java/org/apache/hadoop/hive/ql/MapRedStats.java PRE-CREATION
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/ExecDriver.java 1138748
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/HadoopJobExecHelper.java 1138748
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1138748
  trunk/ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java 1138748

Diff: https://reviews.apache.org/r/948/diff

Testing
---
Ran the updated code against real clusters and verified that the printed output is correct.

Thanks,
Siying

Cli: Print Hadoop's CPU milliseconds

                 Key: HIVE-2236
                 URL: https://issues.apache.org/jira/browse/HIVE-2236
             Project: Hive
          Issue Type: New Feature
          Components: CLI
            Reporter: Siying Dong
            Assignee: Siying Dong
            Priority: Minor
         Attachments: HIVE-2236.1.patch

CPU milliseconds information is available from Hadoop's framework. Printing it out to the Hive CLI when executing a job will help users learn more about their jobs.
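For context, the CPU time this patch surfaces comes from each job's CPU_MILLISECONDS counter in the MapReduce counters; the CLI sums it across the stages of a query and prints a total. The aggregation and formatting side can be sketched in plain Java. This is only an illustration: the class and method names below are hypothetical, not the ones in MapRedStats or HadoopJobExecHelper.

```java
// Hypothetical sketch: accumulate per-job CPU milliseconds and format a
// CLI summary line. Not Hive's actual implementation.
public class CpuMsecFormatter {

    /** Format a CPU time in milliseconds the way a CLI summary line might. */
    public static String format(long msec) {
        long seconds = msec / 1000;
        long remMsec = msec % 1000;
        long minutes = seconds / 60;
        seconds = seconds % 60;
        StringBuilder sb = new StringBuilder();
        if (minutes > 0) {
            sb.append(minutes).append(" minutes ");
        }
        if (seconds > 0 || minutes > 0) {
            sb.append(seconds).append(" seconds ");
        }
        sb.append(remMsec).append(" msec");
        return sb.toString();
    }

    public static void main(String[] args) {
        long[] perJobCpuMsec = {4120, 83500};   // e.g. two MapReduce stages of one query
        long total = 0;
        for (long m : perJobCpuMsec) {
            total += m;                         // sum across stages
        }
        // prints: Total MapReduce CPU Time Spent: 1 minutes 27 seconds 620 msec
        System.out.println("Total MapReduce CPU Time Spent: " + format(total));
    }
}
```

The real counter values would be read from the job's Counters object after each stage completes, which is why the patch touches HadoopJobExecHelper.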
[jira] [Commented] (HIVE-1537) Allow users to specify LOCATION in CREATE DATABASE statement
[ https://issues.apache.org/jira/browse/HIVE-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053764#comment-13053764 ]

jirapos...@reviews.apache.org commented on HIVE-1537:
-

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/949/
---

Review request for hive, Ning Zhang and Amareshwari Sriramadasu.

Summary
---
Usage:
  create database location 'path1';
  alter database location 'path2';

After 'alter', only newly created tables will be located under the new location. Tables created before 'alter' will remain under 'path1'.

Notes:
--
1. I have moved getDefaultDatabasePath() to HiveMetaStore and made it private. There should be only one API to obtain the location of a database, and it has to accept a 'Database' as an argument; hence the new method 'getDatabasePath()' in Warehouse, and similarly 'getTablePath()'. The usages of the older API have also been changed. Hope that is fine.
2. One could argue against having getDatabasePath() since the location can be obtained from db.getLocationUri(). I wanted to retain this method to do any additional processing if necessary (getDns or whatever).

This addresses bug HIVE-1537.
    https://issues.apache.org/jira/browse/HIVE-1537

Diffs
-
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 1138011
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1138011
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1138011
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 1138011
  trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 1138011
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1138011
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1138011
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 1138011
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1138011
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 1138011
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1138011
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 1138011
  trunk/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java 1138011
  trunk/ql/src/test/queries/clientpositive/database_location.q PRE-CREATION
  trunk/ql/src/test/results/clientpositive/database_location.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/949/diff

Testing
---
1. Updated TestHiveMetaStore.java to test the functionality - database creation, alteration, and table locations - since TestCliDriver outputs ignore locations.
2. Added database_location.q to test the grammar primarily.

Thanks,
Thiruvel

Allow users to specify LOCATION in CREATE DATABASE statement

                 Key: HIVE-1537
                 URL: https://issues.apache.org/jira/browse/HIVE-1537
             Project: Hive
          Issue Type: New Feature
          Components: Metastore
            Reporter: Carl Steinbach
            Assignee: Thiruvel Thirumoolan
         Attachments: HIVE-1537.patch, hive-1537.metastore.part.patch
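The note about getDefaultDatabasePath()/getDatabasePath() boils down to one resolution rule: an explicit LOCATION wins; otherwise the database lives at <warehouse-dir>/<dbname>.db, Hive's default convention. A minimal sketch of that rule; the class and method names here are hypothetical stand-ins, not the Warehouse API:

```java
// Hypothetical sketch of database location resolution for HIVE-1537.
// Names are illustrative; only the <warehouse>/<dbname>.db convention is Hive's.
public class DatabasePaths {
    static final String DB_SUFFIX = ".db";

    /**
     * Resolve a database's storage path: an explicit LOCATION takes
     * precedence; otherwise fall back to the warehouse-relative default.
     */
    public static String databasePath(String warehouseDir, String dbName, String locationUri) {
        if (locationUri != null && !locationUri.isEmpty()) {
            return locationUri;                              // explicit LOCATION wins
        }
        // Hive lowercases database names before deriving the default path.
        return warehouseDir + "/" + dbName.toLowerCase() + DB_SUFFIX;
    }

    public static void main(String[] args) {
        // prints: /user/hive/warehouse/sales.db
        System.out.println(databasePath("/user/hive/warehouse", "sales", null));
        // prints: /data/path1
        System.out.println(databasePath("/user/hive/warehouse", "sales", "/data/path1"));
    }
}
```

Keeping the resolution in one place is what the reviewer's note argues for: callers go through one method rather than mixing db.getLocationUri() with ad hoc path building.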
[jira] [Commented] (HIVE-2230) Hive Client build error
[ https://issues.apache.org/jira/browse/HIVE-2230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053806#comment-13053806 ]

Bennie Schut commented on HIVE-2230:

I talked with Dmytro offline, and this line on the wiki should probably be changed: "The Hive ODBC driver was developed with Thrift trunk version r790732, but the latest revision should also be fine." Hive 0.7 and higher uses Thrift 0.5.0. I'm not sure what happens when you mix in a newer version of Thrift, but the older version (r790732) doesn't seem to work. I would advise others to use 0.5.0.

Hive Client build error
---

                 Key: HIVE-2230
                 URL: https://issues.apache.org/jira/browse/HIVE-2230
             Project: Hive
          Issue Type: Bug
          Components: Clients, ODBC
         Environment: hive:
{code}
Path: .
URL: http://svn.apache.org/repos/asf/thrift/trunk
Repository Root: http://svn.apache.org/repos/asf
Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
Revision: 1138011
Node Kind: directory
Schedule: normal
Last Changed Author: molinaro
Last Changed Rev: 1137870
Last Changed Date: 2011-06-21 08:20:18 +0200 (Tue, 21 Jun 2011)
{code}
            Reporter: Dmytro Korochkin

While running
{code}
ant compile-cpp -Dthrift.home=/usr/local
{code}
to build the Hive Client according to http://wiki.apache.org/hadoop/Hive/HiveODBC, I got the following error message:
{code}
compile-cpp:
     [exec] mkdir -p /home/ubuntu/hive/build/metastore/objs
     [exec] g++ -Wall -g -fPIC -m32 -DARCH32 -I/usr/local/include/thrift -I/usr/local/include/thrift/fb303 -I/include -I/home/ubuntu/hive/service/src/gen/thrift/gen-cpp -I/home/ubuntu/hive/ql/src/gen/thrift/gen-cpp -I/home/ubuntu/hive/metastore/src/gen/thrift/gen-cpp -I/home/ubuntu/hive/odbc/src/cpp -c /home/ubuntu/hive/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp -o /home/ubuntu/hive/build/metastore/objs/ThriftHiveMetastore.o
     [exec] /home/ubuntu/hive/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp: In member function 'virtual bool Apache::Hadoop::Hive::ThriftHiveMetastoreProcessor::process_fn(apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*, std::string, int32_t)':
     [exec] /home/ubuntu/hive/metastore/src/gen/thrift/gen-cpp/ThriftHiveMetastore.cpp:18014:92: error: no matching function for call to 'Apache::Hadoop::Hive::ThriftHiveMetastoreProcessor::process_fn(apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*, std::string, int32_t)'
     [exec] /usr/local/include/thrift/fb303/FacebookService.h:1299:16: note: candidate is: virtual bool facebook::fb303::FacebookServiceProcessor::process_fn(apache::thrift::protocol::TProtocol*, apache::thrift::protocol::TProtocol*, std::string, int32_t, void*)
     [exec] make: *** [/home/ubuntu/hive/build/metastore/objs/ThriftHiveMetastore.o] Error 1

BUILD FAILED
{code}
Re: trunk busted?
OK, I guess it's because I'm hitting the permissions mystery discussed with you and Paul back-channel. We need to get that resolved.

JVS

On Jun 22, 2011, at 10:20 PM, Ning Zhang wrote:

  FYI my test just succeeded on the clean checkout of trunk.

  On Jun 22, 2011, at 5:14 PM, yongqiang he wrote:

    database.q failed for me when testing HIVE-2100

    On Wed, Jun 22, 2011 at 2:23 PM, John Sichi jsi...@fb.com wrote:

      Yeah, that's one of the failures (out of many different ones) that Jenkins has been hitting (see the end of this log):
      https://builds.apache.org/view/G-L/view/Hive/job/Hive-trunk-h0.21/788/console
      It's sporadic, probably based on server load.
      JVS

      On Jun 22, 2011, at 2:10 PM, Ning Zhang wrote:

        John, here's what I got for 'ant clean package'. It seems ivy is flaky now?

        ivy-download:
          [get] Getting: http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar
          [get] To: /data/users/nzhang/reviews/2/apache-hive/build/ivy/lib/ivy-2.1.0.jar
          [get] Error getting http://repo2.maven.org/maven2/org/apache/ivy/ivy/2.1.0/ivy-2.1.0.jar to /data/users/nzhang/reviews/2/apache-hive/build/ivy/lib/ivy-2.1.0.jar

        BUILD FAILED
        /data/users/nzhang/reviews/2/apache-hive/build.xml:196: The following error occurred while executing this line:
        /data/users/nzhang/reviews/2/apache-hive/build.xml:130: The following error occurred while executing this line:
        /data/users/nzhang/reviews/2/apache-hive/build-common.xml:128: java.net.ConnectException: Connection refused

        On Jun 22, 2011, at 12:43 PM, John Sichi wrote:

          Yeah, all tests passed when I committed the bitmap indexes, so I'm not sure what's up.
          JVS

          On Jun 22, 2011, at 12:36 PM, Ning Zhang wrote:

            trunk was fine the last time I committed. John, the last ones who committed were Carl (branching 0.7.1) and you (bitmap index). :) Did you get all the tests passed? I'll test with a clean checkout.

            On Jun 22, 2011, at 12:02 PM, John Sichi wrote:

              Are other committers able to pass tests on Hive trunk? I'm getting lots of failures, and Jenkins seems to have been barfing for a while too.
              JVS
Re: Review Request: HIVE-1537 - Allow users to specify LOCATION in CREATE DATABASE statement
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/949/#review898
---

trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/949/#comment1938
    This may not always succeed; creating directories can fail for a number of reasons, so it needs to be handled gracefully. The transaction needs to roll back in such a case, and the CREATE DATABASE DDL needs to fail. For more info, look at Devaraj's first comment and his attached partial patch.

trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/949/#comment1941
    As above, mkdirs() can fail, so handle it the same way as in createDatabase().

trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
https://reviews.apache.org/r/949/#comment1942
    Please also add a test for when CREATE DATABASE fails because an FS operation fails; in such a case no metadata should get created. One way to simulate that is to make the location unwritable and then try to create a database at that location.

- Ashutosh

On 2011-06-23 09:55:50, Thiruvel Thirumoolan wrote:

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/949/
---

(Updated 2011-06-23 09:55:50)

Review request for hive, Ning Zhang and Amareshwari Sriramadasu.

Summary
---
Usage:
  create database location 'path1';
  alter database location 'path2';

After 'alter', only newly created tables will be located under the new location. Tables created before 'alter' will remain under 'path1'.

Notes:
--
1. I have moved getDefaultDatabasePath() to HiveMetaStore and made it private. There should be only one API to obtain the location of a database, and it has to accept a 'Database' as an argument; hence the new method 'getDatabasePath()' in Warehouse, and similarly 'getTablePath()'. The usages of the older API have also been changed. Hope that is fine.
2. One could argue against having getDatabasePath() since the location can be obtained from db.getLocationUri(). I wanted to retain this method to do any additional processing if necessary (getDns or whatever).

This addresses bug HIVE-1537.
    https://issues.apache.org/jira/browse/HIVE-1537

Diffs
-
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java 1138011
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java 1138011
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/ObjectStore.java 1138011
  trunk/metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java 1138011
  trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java 1138011
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1138011
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/StatsTask.java 1138011
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 1138011
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/Hive.g 1138011
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/ImportSemanticAnalyzer.java 1138011
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java 1138011
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzerFactory.java 1138011
  trunk/ql/src/test/org/apache/hadoop/hive/ql/metadata/TestHive.java 1138011
  trunk/ql/src/test/queries/clientpositive/database_location.q PRE-CREATION
  trunk/ql/src/test/results/clientpositive/database_location.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/949/diff

Testing
---
1. Updated TestHiveMetaStore.java to test the functionality - database creation, alteration, and table locations - since TestCliDriver outputs ignore locations.
2. Added database_location.q to test the grammar primarily.

Thanks,
Thiruvel
[jira] [Commented] (HIVE-1537) Allow users to specify LOCATION in CREATE DATABASE statement
[ https://issues.apache.org/jira/browse/HIVE-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053959#comment-13053959 ]

jirapos...@reviews.apache.org commented on HIVE-1537:
-

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/949/#review898
---

trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/949/#comment1938
    This may not always succeed; creating directories can fail for a number of reasons, so it needs to be handled gracefully. The transaction needs to roll back in such a case, and the CREATE DATABASE DDL needs to fail. For more info, look at Devaraj's first comment and his attached partial patch.

trunk/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
https://reviews.apache.org/r/949/#comment1941
    As above, mkdirs() can fail, so handle it the same way as in createDatabase().

trunk/metastore/src/test/org/apache/hadoop/hive/metastore/TestHiveMetaStore.java
https://reviews.apache.org/r/949/#comment1942
    Please also add a test for when CREATE DATABASE fails because an FS operation fails; in such a case no metadata should get created. One way to simulate that is to make the location unwritable and then try to create a database at that location.

- Ashutosh

Allow users to specify LOCATION in CREATE DATABASE statement

                 Key: HIVE-1537
                 URL: https://issues.apache.org/jira/browse/HIVE-1537
             Project: Hive
          Issue Type: New Feature
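Ashutosh's first two comments ask for the same pattern: perform the filesystem operation inside the metadata transaction and roll back if it fails, so a failed mkdirs() never leaves a database record behind. A hedged sketch of that pattern, using stand-in interfaces rather than Hive's actual HiveMetaStore or Warehouse types:

```java
import java.io.IOException;

// Hypothetical sketch of the rollback pattern requested in the review.
// FileSystemOps and MetaStoreTxn are stand-ins, not Hive APIs.
public class CreateDatabaseSketch {
    interface FileSystemOps { boolean mkdirs(String path); }
    interface MetaStoreTxn {
        void open();
        void addDatabase(String name, String location);
        void commit();
        void rollback();
    }

    public static void createDatabase(String name, String location,
                                      FileSystemOps fs, MetaStoreTxn txn) throws IOException {
        txn.open();
        boolean committed = false;
        try {
            if (!fs.mkdirs(location)) {            // mkdirs can fail for many reasons
                throw new IOException("Unable to create database path " + location);
            }
            txn.addDatabase(name, location);
            txn.commit();
            committed = true;
        } finally {
            if (!committed) {
                txn.rollback();                    // no metadata survives a failed FS op
            }
        }
    }

    public static void main(String[] args) throws IOException {
        MetaStoreTxn txn = new MetaStoreTxn() {
            public void open() { System.out.println("open txn"); }
            public void addDatabase(String n, String l) { System.out.println("add " + n + " at " + l); }
            public void commit() { System.out.println("commit"); }
            public void rollback() { System.out.println("rollback"); }
        };
        createDatabase("analytics", "/warehouse/analytics.db", p -> true, txn);
    }
}
```

The test Ashutosh asks for then just drives this path with an unwritable location and asserts that no database metadata was committed.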
Re: Review Request: HIVE-2035 Use block level merge on rcfile if intermediate merge is needed
---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/935/
---

(Updated 2011-06-23 18:56:14.903379)

Review request for hive.

Changes
---
Add max and min split size configs to unit tests

Summary
---
For a table stored as RCFile, intermediate results are sometimes merged if those files are below a certain threshold. For RCFiles, we can do a block-level merge that does not deserialize the blocks and is more efficient. This patch leverages the existing code used to merge for ALTER TABLE ... CONCATENATE.

This addresses bug HIVE-2035.
    https://issues.apache.org/jira/browse/HIVE-2035

Diffs (updated)
-
  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1139014
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 1139014
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java 1139014
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/MergeWork.java 1139014
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeRecordReader.java 1139014
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileKeyBufferWrapper.java 1139014
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java 1139014
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java 1139014
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java 1139014
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 1139014
  trunk/ql/src/test/queries/clientpositive/rcfile_createas1.q PRE-CREATION
  trunk/ql/src/test/queries/clientpositive/rcfile_merge1.q PRE-CREATION
  trunk/ql/src/test/queries/clientpositive/rcfile_merge2.q PRE-CREATION
  trunk/ql/src/test/queries/clientpositive/rcfile_merge3.q PRE-CREATION
  trunk/ql/src/test/queries/clientpositive/rcfile_merge4.q PRE-CREATION
  trunk/ql/src/test/results/clientpositive/rcfile_createas1.q.out PRE-CREATION
  trunk/ql/src/test/results/clientpositive/rcfile_merge1.q.out PRE-CREATION
  trunk/ql/src/test/results/clientpositive/rcfile_merge2.q.out PRE-CREATION
  trunk/ql/src/test/results/clientpositive/rcfile_merge3.q.out PRE-CREATION
  trunk/ql/src/test/results/clientpositive/rcfile_merge4.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/935/diff

Testing
---

Thanks,
Franklin
[jira] [Updated] (HIVE-2035) Use block-level merge for RCFile if merging intermediate results are needed
[ https://issues.apache.org/jira/browse/HIVE-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Franklin Hu updated HIVE-2035:
--
    Attachment: hive-2035.3.patch

Add min/max split size settings to unit tests

Use block-level merge for RCFile if merging intermediate results are needed
---

                 Key: HIVE-2035
                 URL: https://issues.apache.org/jira/browse/HIVE-2035
             Project: Hive
          Issue Type: Improvement
            Reporter: Ning Zhang
            Assignee: Franklin Hu
         Attachments: hive-2035.1.patch, hive-2035.3.patch

Currently, if hive.merge.mapredfiles and/or hive.merge.mapfiles is set to true, the intermediate data could be merged using an additional MapReduce job. This could be quite expensive if the data size is large. With HIVE-1950, merging can be done at the RCFile block level so that it bypasses the (de-)compression and (de-)serialization phases. This could improve the merge process significantly. This JIRA should handle the case where the input table is not stored in RCFile but the destination table is (which requires that the intermediate data be stored in the same format as the destination table).
[jira] [Commented] (HIVE-2035) Use block-level merge for RCFile if merging intermediate results are needed
[ https://issues.apache.org/jira/browse/HIVE-2035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054033#comment-13054033 ]

jirapos...@reviews.apache.org commented on HIVE-2035:
-

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/935/
---

(Updated 2011-06-23 18:56:14.903379)

Review request for hive.

Changes
---
Add max and min split size configs to unit tests

Summary
---
For a table stored as RCFile, intermediate results are sometimes merged if those files are below a certain threshold. For RCFiles, we can do a block-level merge that does not deserialize the blocks and is more efficient. This patch leverages the existing code used to merge for ALTER TABLE ... CONCATENATE.

This addresses bug HIVE-2035.
    https://issues.apache.org/jira/browse/HIVE-2035

Diffs (updated)
-
  trunk/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 1139014
  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 1139014
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java 1139014
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/MergeWork.java 1139014
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeRecordReader.java 1139014
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileKeyBufferWrapper.java 1139014
  trunk/ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java 1139014
  trunk/ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java 1139014
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverMergeFiles.java 1139014
  trunk/ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 1139014
  trunk/ql/src/test/queries/clientpositive/rcfile_createas1.q PRE-CREATION
  trunk/ql/src/test/queries/clientpositive/rcfile_merge1.q PRE-CREATION
  trunk/ql/src/test/queries/clientpositive/rcfile_merge2.q PRE-CREATION
  trunk/ql/src/test/queries/clientpositive/rcfile_merge3.q PRE-CREATION
  trunk/ql/src/test/queries/clientpositive/rcfile_merge4.q PRE-CREATION
  trunk/ql/src/test/results/clientpositive/rcfile_createas1.q.out PRE-CREATION
  trunk/ql/src/test/results/clientpositive/rcfile_merge1.q.out PRE-CREATION
  trunk/ql/src/test/results/clientpositive/rcfile_merge2.q.out PRE-CREATION
  trunk/ql/src/test/results/clientpositive/rcfile_merge3.q.out PRE-CREATION
  trunk/ql/src/test/results/clientpositive/rcfile_merge4.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/935/diff

Testing
---

Thanks,
Franklin

Use block-level merge for RCFile if merging intermediate results are needed
---

                 Key: HIVE-2035
                 URL: https://issues.apache.org/jira/browse/HIVE-2035
             Project: Hive
          Issue Type: Improvement
            Reporter: Ning Zhang
            Assignee: Franklin Hu
         Attachments: hive-2035.1.patch, hive-2035.3.patch

Currently, if hive.merge.mapredfiles and/or hive.merge.mapfiles is set to true, the intermediate data could be merged using an additional MapReduce job. This could be quite expensive if the data size is large. With HIVE-1950, merging can be done at the RCFile block level so that it bypasses the (de-)compression and (de-)serialization phases. This could improve the merge process significantly. This JIRA should handle the case where the input table is not stored in RCFile but the destination table is (which requires that the intermediate data be stored in the same format as the destination table).
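The merge step discussed here is conditional: Hive only schedules it when the intermediate output files are small on average (governed by hive.merge.smallfiles.avgsize). The decision itself is a simple size check; a sketch under illustrative names, not Hive's ConditionalResolverMergeFiles API:

```java
// Hypothetical sketch of the "should we merge?" decision for small
// intermediate files. The class and method names are illustrative.
public class MergeDecision {

    /** Return true if the output files are small enough on average to merge. */
    public static boolean shouldMerge(long[] fileSizes, long avgSizeThreshold) {
        if (fileSizes.length <= 1) {
            return false;                 // nothing to merge
        }
        long total = 0;
        for (long s : fileSizes) {
            total += s;
        }
        return (total / fileSizes.length) < avgSizeThreshold;
    }

    public static void main(String[] args) {
        long threshold = 16L * 1000 * 1000;           // roughly the 16MB-class default
        long[] sizes = {1 << 20, 2 << 20, 3 << 20};   // three small intermediate files
        // prints: merge needed: true
        System.out.println("merge needed: " + shouldMerge(sizes, threshold));
    }
}
```

When the check fires and the destination table is an RCFile, this patch swaps the extra MapReduce merge job for the cheaper block-level merge.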
Jenkins build is back to normal : Hive-trunk-h0.21 #790
See https://builds.apache.org/job/Hive-trunk-h0.21/790/
[jira] [Commented] (HIVE-2215) Add api for marking / querying set of partitions for events
[ https://issues.apache.org/jira/browse/HIVE-2215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054057#comment-13054057 ]

Hudson commented on HIVE-2215:
--

Integrated in Hive-trunk-h0.21 #790 (See [https://builds.apache.org/job/Hive-trunk-h0.21/790/])

Add api for marking / querying set of partitions for events
---

                 Key: HIVE-2215
                 URL: https://issues.apache.org/jira/browse/HIVE-2215
             Project: Hive
          Issue Type: New Feature
          Components: Metastore
    Affects Versions: 0.8.0
            Reporter: Ashutosh Chauhan
            Assignee: Ashutosh Chauhan
             Fix For: 0.8.0
         Attachments: hive-2215_full-1.patch, hive_2215.patch
[jira] [Commented] (HIVE-2218) speedup addInputPaths
[ https://issues.apache.org/jira/browse/HIVE-2218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054060#comment-13054060 ]

Hudson commented on HIVE-2218:
--

Integrated in Hive-trunk-h0.21 #790 (See [https://builds.apache.org/job/Hive-trunk-h0.21/790/])

speedup addInputPaths
-

                 Key: HIVE-2218
                 URL: https://issues.apache.org/jira/browse/HIVE-2218
             Project: Hive
          Issue Type: Improvement
            Reporter: He Yongqiang
            Assignee: He Yongqiang
             Fix For: 0.8.0
         Attachments: HIVE-2218.1.patch, HIVE-2218.2.patch, HIVE-2218.3.patch

Speed up addInputPaths for the combined symlink input format, and add some other micro-optimizations that also work for normal cases. This can help reduce the start time of one query from 5 hours to less than 20 minutes.
[jira] [Commented] (HIVE-2176) Schema creation scripts are incomplete since they leave out tables that are specific to DataNucleus
[ https://issues.apache.org/jira/browse/HIVE-2176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054059#comment-13054059 ]

Hudson commented on HIVE-2176:
--

Integrated in Hive-trunk-h0.21 #790 (See [https://builds.apache.org/job/Hive-trunk-h0.21/790/])

Schema creation scripts are incomplete since they leave out tables that are specific to DataNucleus
---

                 Key: HIVE-2176
                 URL: https://issues.apache.org/jira/browse/HIVE-2176
             Project: Hive
          Issue Type: Bug
          Components: Configuration, Metastore
    Affects Versions: 0.5.0, 0.6.0, 0.7.0
            Reporter: Esteban Gutierrez
            Assignee: Esteban Gutierrez
              Labels: derby, mysql, postgres
             Fix For: 0.7.1, 0.8.0
         Attachments: HIVE-2176.3.patch.txt

When using the DDL SQL scripts to create the Metastore, tables like SEQUENCE_TABLE are missing, forcing the user to change the configuration so that DataNucleus does all the provisioning of the Metastore tables. Adding the missing table definitions to the DDL scripts will allow users to have a functional Hive Metastore without granting additional privileges to the Metastore user and/or enabling the datanucleus.autoCreateSchema property in hive-site.xml.

[After running hive-schema-0.7.0.mysql.sql and revoking ALTER and CREATE privileges from the 'metastoreuser']

hive> show tables;
FAILED: Error in metadata: javax.jdo.JDOException: Exception thrown calling table.exists() for `SEQUENCE_TABLE`
NestedThrowables:
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: CREATE command denied to user 'metastoreuser'@'localhost' for table 'SEQUENCE_TABLE'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask
[jira] [Commented] (HIVE-2158) add the HivePreparedStatement implementation based on current HIVE supported data-type
[ https://issues.apache.org/jira/browse/HIVE-2158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054061#comment-13054061 ]

Hudson commented on HIVE-2158:
--

Integrated in Hive-trunk-h0.21 #790 (See [https://builds.apache.org/job/Hive-trunk-h0.21/790/])

add the HivePreparedStatement implementation based on current HIVE supported data-type
--

                 Key: HIVE-2158
                 URL: https://issues.apache.org/jira/browse/HIVE-2158
             Project: Hive
          Issue Type: Sub-task
          Components: JDBC
    Affects Versions: 0.6.0, 0.7.0, 0.8.0
            Reporter: Yuanjun Li
            Assignee: Yuanjun Li
             Fix For: 0.7.1, 0.8.0
         Attachments: HIVE-0.7.1-PreparedStatement.1.patch.txt, HIVE-0.8-PreparedStatement.1.patch.txt
[jira] [Commented] (HIVE-2036) Update bitmap indexes for automatic usage
[ https://issues.apache.org/jira/browse/HIVE-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054058#comment-13054058 ] Hudson commented on HIVE-2036: -- Integrated in Hive-trunk-h0.21 #790 (See [https://builds.apache.org/job/Hive-trunk-h0.21/790/]) Update bitmap indexes for automatic usage - Key: HIVE-2036 URL: https://issues.apache.org/jira/browse/HIVE-2036 Project: Hive Issue Type: Improvement Components: Indexing Affects Versions: 0.8.0 Reporter: Russell Melick Assignee: Syed S. Albiz Fix For: 0.8.0 Attachments: HIVE-2036.1.patch, HIVE-2036.3.patch, HIVE-2036.8.patch HIVE-1644 will provide automatic usage of indexes, and HIVE-1803 adds bitmap index support. The bitmap code will need to be extended after it is committed to enable automatic use of indexing. Most work will be focused in the BitmapIndexHandler, which needs to generate the re-entrant QL index query. There may also be significant work in the IndexPredicateAnalyzer to support predicates with OR's, instead of just AND's as it is currently. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2222) runnable queue in Driver and DriverContext is not thread safe
[ https://issues.apache.org/jira/browse/HIVE-2222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054063#comment-13054063 ] Hudson commented on HIVE-2222: -- Integrated in Hive-trunk-h0.21 #790 (See [https://builds.apache.org/job/Hive-trunk-h0.21/790/]) runnable queue in Driver and DriverContext is not thread safe - Key: HIVE-2222 URL: https://issues.apache.org/jira/browse/HIVE-2222 Project: Hive Issue Type: Bug Reporter: He Yongqiang Assignee: Namit Jain Attachments: hive.2222.1.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2140) Return correct Major / Minor version numbers for Hive Driver
[ https://issues.apache.org/jira/browse/HIVE-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054062#comment-13054062 ] Hudson commented on HIVE-2140: -- Integrated in Hive-trunk-h0.21 #790 (See [https://builds.apache.org/job/Hive-trunk-h0.21/790/]) Return correct Major / Minor version numbers for Hive Driver Key: HIVE-2140 URL: https://issues.apache.org/jira/browse/HIVE-2140 Project: Hive Issue Type: Sub-task Components: JDBC Affects Versions: 0.6.0, 0.7.0 Reporter: Curtis Boyden Assignee: Curtis Boyden Fix For: 0.7.1, 0.8.0 Attachments: hive-0.6-driver-version.patch, hive-0.7-driver-version.patch, hive-trunk-driver-version.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2213) Optimize partial specification metastore functions
[ https://issues.apache.org/jira/browse/HIVE-2213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054064#comment-13054064 ] Hudson commented on HIVE-2213: -- Integrated in Hive-trunk-h0.21 #790 (See [https://builds.apache.org/job/Hive-trunk-h0.21/790/]) Optimize partial specification metastore functions -- Key: HIVE-2213 URL: https://issues.apache.org/jira/browse/HIVE-2213 Project: Hive Issue Type: Improvement Components: Metastore Reporter: Sohan Jain Assignee: Sohan Jain Fix For: 0.8.0 Attachments: HIVE-2213.1.patch, HIVE-2213.3.patch If a table has a large number of partitions, get_partition_names_ps() may take a long time to execute, because we get all of the partition names from the database. This is not very memory efficient, and the operation can be pushed down to the JDO layer without getting all of the names first. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
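The idea behind the HIVE-2213 optimization can be illustrated with a small sketch. This is not Hive's actual JDO code; it uses an in-memory SQLite table with an illustrative PARTITIONS schema and partition-name layout to contrast client-side filtering (fetch every name, filter in memory) with pushing the partial-spec match down into the database query:

```python
import sqlite3

# Illustrative sketch (not Hive's real metastore code) of pushing a partial
# partition-spec match down to the database instead of filtering in memory.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE PARTITIONS (PART_NAME TEXT)")
db.executemany("INSERT INTO PARTITIONS VALUES (?)",
               [("ds=2011-06-01/hr=%02d" % h,) for h in range(24)])

# Before: pull all partition names, then filter client-side. This ships
# every row out of the database and holds them all in memory.
all_names = [r[0] for r in db.execute("SELECT PART_NAME FROM PARTITIONS")]
filtered_in_memory = [n for n in all_names
                      if n.startswith("ds=2011-06-01/hr=1")]

# After: let the database evaluate the partial-spec prefix match, so only
# the matching names are fetched.
pushed_down = [r[0] for r in db.execute(
    "SELECT PART_NAME FROM PARTITIONS WHERE PART_NAME LIKE ?",
    ("ds=2011-06-01/hr=1%",))]

# Same answer either way; the pushdown just avoids transferring 24 rows
# to get the 10 that match.
assert sorted(filtered_in_memory) == sorted(pushed_down)
```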
[jira] [Created] (HIVE-2237) hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java
hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java - Key: HIVE-2237 URL: https://issues.apache.org/jira/browse/HIVE-2237 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.8.0 Reporter: Patrick Hunt Assignee: Patrick Hunt I see the following error in helios eclipse with the latest trunk (although build on the command line is fine): Syntax error on token ;, delete this token seems to have been introduced by this change in HIVE-2036 +import org.apache.hadoop.hive.ql.index.HiveIndexedInputFormat;; I have a patch forthcoming. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2237) hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java
[ https://issues.apache.org/jira/browse/HIVE-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated HIVE-2237: --- Attachment: HIVE-2237.patch patch to remove the extra semi hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java - Key: HIVE-2237 URL: https://issues.apache.org/jira/browse/HIVE-2237 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.8.0 Reporter: Patrick Hunt Assignee: Patrick Hunt Attachments: HIVE-2237.patch I see the following error in helios eclipse with the latest trunk (although build on the command line is fine): Syntax error on token ;, delete this token seems to have been introduced by this change in HIVE-2036 +import org.apache.hadoop.hive.ql.index.HiveIndexedInputFormat;; I have a patch forthcoming. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2237) hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java
[ https://issues.apache.org/jira/browse/HIVE-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Hunt updated HIVE-2237: --- Status: Patch Available (was: Open) hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java - Key: HIVE-2237 URL: https://issues.apache.org/jira/browse/HIVE-2237 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.8.0 Reporter: Patrick Hunt Assignee: Patrick Hunt Attachments: HIVE-2237.patch I see the following error in helios eclipse with the latest trunk (although build on the command line is fine): Syntax error on token ;, delete this token seems to have been introduced by this change in HIVE-2036 +import org.apache.hadoop.hive.ql.index.HiveIndexedInputFormat;; I have a patch forthcoming. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2237) hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java
[ https://issues.apache.org/jira/browse/HIVE-2237?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Sichi updated HIVE-2237: - Resolution: Fixed Hadoop Flags: [Reviewed] Status: Resolved (was: Patch Available) +1, committed to trunk. Thanks Patrick! hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java - Key: HIVE-2237 URL: https://issues.apache.org/jira/browse/HIVE-2237 Project: Hive Issue Type: Bug Components: Build Infrastructure Affects Versions: 0.8.0 Reporter: Patrick Hunt Assignee: Patrick Hunt Attachments: HIVE-2237.patch I see the following error in helios eclipse with the latest trunk (although build on the command line is fine): Syntax error on token ;, delete this token seems to have been introduced by this change in HIVE-2036 +import org.apache.hadoop.hive.ql.index.HiveIndexedInputFormat;; I have a patch forthcoming. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-895) Add SerDe for Avro serialized data
[ https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054116#comment-13054116 ] Carl Steinbach commented on HIVE-895: - @jakob: The code on github looks really good. The release branch for 0.8.0 is going to get created sometime in the next couple of weeks. Do you think it will be possible to get a patch ready for review before then? Add SerDe for Avro serialized data -- Key: HIVE-895 URL: https://issues.apache.org/jira/browse/HIVE-895 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Jeff Hammerbacher Assignee: Jakob Homan As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro data seems like a solid win. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-895) Add SerDe for Avro serialized data
[ https://issues.apache.org/jira/browse/HIVE-895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054124#comment-13054124 ] Jakob Homan commented on HIVE-895: -- A couple weeks is probably not feasible. Assuming 0.9 comes out in a few months after that, that's probably a better bet. Add SerDe for Avro serialized data -- Key: HIVE-895 URL: https://issues.apache.org/jira/browse/HIVE-895 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Jeff Hammerbacher Assignee: Jakob Homan As Avro continues to mature, having a SerDe to allow HiveQL queries over Avro data seems like a solid win. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-2236) Cli: Print Hadoop's CPU milliseconds
[ https://issues.apache.org/jira/browse/HIVE-2236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Siying Dong updated HIVE-2236: -- Status: Patch Available (was: Open) Cli: Print Hadoop's CPU milliseconds Key: HIVE-2236 URL: https://issues.apache.org/jira/browse/HIVE-2236 Project: Hive Issue Type: New Feature Components: CLI Reporter: Siying Dong Assignee: Siying Dong Priority: Minor Attachments: HIVE-2236.1.patch CPU Milliseconds information is available from Hadoop's framework. Printing it out to the Hive CLI when executing a job will help users learn more about their jobs. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
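A rough sketch of what HIVE-2236 adds: after each MapReduce job the CLI reads the framework's CPU-milliseconds counter, keeps a running total across all jobs in the query, and prints it with the job summary. The class name `MapRedStats` comes from the patch's file list; its fields, the formatting helper, and the exact output text below are illustrative assumptions, not the committed implementation:

```python
# Hypothetical sketch of accumulating per-job CPU milliseconds (HIVE-2236).
# Field names and output format are assumptions for illustration.

class MapRedStats:
    """Per-job stats the CLI could collect; cpu_msec is hypothetical."""
    def __init__(self, job_id, cpu_msec):
        self.job_id = job_id
        self.cpu_msec = cpu_msec

def format_cpu_msec(msec):
    # Render e.g. 4250 as "4 seconds 250 msec".
    seconds, ms = divmod(msec, 1000)
    return "%d seconds %d msec" % (seconds, ms)

def summarize_jobs(stats_list):
    # One line per job plus a cumulative total, as the CLI might print.
    total = 0
    lines = []
    for s in stats_list:
        total += s.cpu_msec
        lines.append("Job %s: CPU time %s" % (s.job_id, format_cpu_msec(s.cpu_msec)))
    lines.append("Total MapReduce CPU time spent: %s" % format_cpu_msec(total))
    return lines

summary = summarize_jobs([MapRedStats("job_1", 1500), MapRedStats("job_2", 2750)])
```

For the two jobs above, the final line reports the 4250 msec total as "4 seconds 250 msec".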
[jira] [Commented] (HIVE-2201) reduce name node calls in hive by creating temporary directories
[ https://issues.apache.org/jira/browse/HIVE-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054188#comment-13054188 ] Siying Dong commented on HIVE-2201: --- ping reduce name node calls in hive by creating temporary directories Key: HIVE-2201 URL: https://issues.apache.org/jira/browse/HIVE-2201 Project: Hive Issue Type: Improvement Reporter: Namit Jain Assignee: Siying Dong Attachments: HIVE-2201.1.patch, HIVE-2201.2.patch, HIVE-2201.3.patch Currently, in Hive, when a file gets written by a FileSinkOperator, the sequence of operations is as follows: 1. In tmp directory tmp1, create a tmp file _tmp_1 2. At the end of the operator, move /tmp1/_tmp_1 to /tmp1/1 3. Move directory /tmp1 to /tmp2 4. For all files in /tmp2, remove all files starting with _tmp and duplicate files. Due to speculative execution, a lot of temporary files are created in /tmp1 (or /tmp2). This leads to a lot of name node calls, especially for large queries. The protocol above can be modified slightly: 1. In tmp directory tmp1, create a tmp file _tmp_1 2. At the end of the operator, move /tmp1/_tmp_1 to /tmp2/1 3. Move directory /tmp2 to /tmp3 4. For all files in /tmp3, remove all duplicate files. This should reduce the number of tmp files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-2201) reduce name node calls in hive by creating temporary directories
[ https://issues.apache.org/jira/browse/HIVE-2201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13054200#comment-13054200 ] He Yongqiang commented on HIVE-2201: i will take a look... reduce name node calls in hive by creating temporary directories Key: HIVE-2201 URL: https://issues.apache.org/jira/browse/HIVE-2201 Project: Hive Issue Type: Improvement Reporter: Namit Jain Assignee: Siying Dong Attachments: HIVE-2201.1.patch, HIVE-2201.2.patch, HIVE-2201.3.patch Currently, in Hive, when a file gets written by a FileSinkOperator, the sequence of operations is as follows: 1. In tmp directory tmp1, create a tmp file _tmp_1 2. At the end of the operator, move /tmp1/_tmp_1 to /tmp1/1 3. Move directory /tmp1 to /tmp2 4. For all files in /tmp2, remove all files starting with _tmp and duplicate files. Due to speculative execution, a lot of temporary files are created in /tmp1 (or /tmp2). This leads to a lot of name node calls, especially for large queries. The protocol above can be modified slightly: 1. In tmp directory tmp1, create a tmp file _tmp_1 2. At the end of the operator, move /tmp1/_tmp_1 to /tmp2/1 3. Move directory /tmp2 to /tmp3 4. For all files in /tmp3, remove all duplicate files. This should reduce the number of tmp files. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
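The revised four-step protocol quoted in the HIVE-2201 description can be sketched with plain filesystem renames. This is a local simulation, not Hive's HDFS code; the directory and file names are illustrative. The key point is step 2: the task-commit rename drops the _tmp prefix and lands directly in the job-level directory, so the final cleanup pass no longer has to scan for and delete leftover _tmp files from speculative attempts:

```python
import os
import tempfile

# Local-filesystem simulation of the modified FileSinkOperator protocol.
base = tempfile.mkdtemp()
tmp1 = os.path.join(base, "tmp1")  # per-task scratch directory
tmp2 = os.path.join(base, "tmp2")  # directory for committed task outputs
tmp3 = os.path.join(base, "tmp3")  # final published directory
os.makedirs(tmp1)
os.makedirs(tmp2)

# 1. In tmp directory tmp1, create a tmp file _tmp_1.
task_file = os.path.join(tmp1, "_tmp_1")
with open(task_file, "w") as f:
    f.write("rows")

# 2. At the end of the operator, move /tmp1/_tmp_1 to /tmp2/1.
#    Speculative attempts that never commit leave their _tmp files in
#    their own tmp1 and never pollute tmp2.
os.rename(task_file, os.path.join(tmp2, "1"))

# 3. Move directory /tmp2 to /tmp3: one rename publishes all task outputs.
os.rename(tmp2, tmp3)

# 4. Only duplicate removal remains; there is no _tmp cleanup pass, which
#    is where the saved name node calls come from.
assert os.listdir(tmp3) == ["1"]
```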
Build failed in Jenkins: Hive-trunk-h0.21 #791
See https://builds.apache.org/job/Hive-trunk-h0.21/791/changes Changes: [jvs] HIVE-2237. hive fails to build in eclipse due to syntax error in BitmapIndexHandler.java (Patrick Hunt via jvs) -- [...truncated 30941 lines...] [junit] OK [junit] PREHOOK: query: select count(1) as cnt from testhivedrivertable [junit] PREHOOK: type: QUERY [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: file:/tmp/hudson/hive_2011-06-23_18-48-32_025_6710146509679189022/-mr-1 [junit] Total MapReduce jobs = 1 [junit] Launching Job 1 out of 1 [junit] Number of reduce tasks determined at compile time: 1 [junit] In order to change the average load for a reducer (in bytes): [junit] set hive.exec.reducers.bytes.per.reducer=number [junit] In order to limit the maximum number of reducers: [junit] set hive.exec.reducers.max=number [junit] In order to set a constant number of reducers: [junit] set mapred.reduce.tasks=number [junit] Job running in-process (local Hadoop) [junit] Hadoop job information for null: number of mappers: 0; number of reducers: 0 [junit] 2011-06-23 18:48:35,141 null map = 100%, reduce = 100% [junit] Ended Job = job_local_0001 [junit] POSTHOOK: query: select count(1) as cnt from testhivedrivertable [junit] POSTHOOK: type: QUERY [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: file:/tmp/hudson/hive_2011-06-23_18-48-32_025_6710146509679189022/-mr-1 [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] Hive history file=https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/service/tmp/hive_job_log_hudson_201106231848_247218233.txt [junit] PREHOOK: query: 
drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] OK [junit] PREHOOK: query: create table testhivedrivertable (num int) [junit] PREHOOK: type: CREATETABLE [junit] POSTHOOK: query: create table testhivedrivertable (num int) [junit] POSTHOOK: type: CREATETABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: load data local inpath 'https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/data/files/kv1.txt' into table testhivedrivertable [junit] PREHOOK: type: LOAD [junit] PREHOOK: Output: default@testhivedrivertable [junit] Copying data from https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/data/files/kv1.txt [junit] Loading data to table default.testhivedrivertable [junit] POSTHOOK: query: load data local inpath 'https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/data/files/kv1.txt' into table testhivedrivertable [junit] POSTHOOK: type: LOAD [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] PREHOOK: query: select * from testhivedrivertable limit 10 [junit] PREHOOK: type: QUERY [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: file:/tmp/hudson/hive_2011-06-23_18-48-36_668_4580842764428959983/-mr-1 [junit] POSTHOOK: query: select * from testhivedrivertable limit 10 [junit] POSTHOOK: type: QUERY [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: file:/tmp/hudson/hive_2011-06-23_18-48-36_668_4580842764428959983/-mr-1 [junit] OK [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] PREHOOK: Input: default@testhivedrivertable [junit] PREHOOK: Output: default@testhivedrivertable [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] POSTHOOK: Input: default@testhivedrivertable [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK [junit] Hive 
history file=https://builds.apache.org/job/Hive-trunk-h0.21/ws/hive/build/service/tmp/hive_job_log_hudson_201106231848_1722739147.txt [junit] PREHOOK: query: drop table testhivedrivertable [junit] PREHOOK: type: DROPTABLE [junit] POSTHOOK: query: drop table testhivedrivertable [junit] POSTHOOK: type: DROPTABLE [junit] OK [junit] PREHOOK: query: create table testhivedrivertable (num int) [junit] PREHOOK: type: CREATETABLE [junit] POSTHOOK: query: create table testhivedrivertable (num int) [junit] POSTHOOK: type: CREATETABLE [junit] POSTHOOK: Output: default@testhivedrivertable [junit] OK