[jira] [Commented] (HIVE-4018) MapJoin failing with Distributed Cache error
[ https://issues.apache.org/jira/browse/HIVE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616123#comment-13616123 ]

Namit Jain commented on HIVE-4018:
----------------------------------

+1

Missed this -- running tests

MapJoin failing with Distributed Cache error

                Key: HIVE-4018
                URL: https://issues.apache.org/jira/browse/HIVE-4018
            Project: Hive
         Issue Type: Bug
         Components: SQL
   Affects Versions: 0.11.0
           Reporter: Amareshwari Sriramadasu
           Assignee: Amareshwari Sriramadasu
            Fix For: 0.11.0
        Attachments: HIVE-4018.patch, hive.4018.test.2.patch, HIVE-4018-test.patch

When I'm running a star join query after HIVE-3784, it is failing with the following error:

2013-02-13 08:36:04,584 ERROR org.apache.hadoop.hive.ql.exec.MapJoinOperator: Load Distributed Cache Error
2013-02-13 08:36:04,585 FATAL ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.EOFException
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:189)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.cleanUpInputFileChangedOp(MapJoinOperator.java:203)
    at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1421)
    at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1425)
    at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1425)
    at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1425)
    at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1425)
    at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1425)
    at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1425)
    at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1425)
    at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1425)
    at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1425)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:614)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:144)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
    at org.apache.hadoop.mapred.Child.main(Child.java:260)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4018) MapJoin failing with Distributed Cache error
[ https://issues.apache.org/jira/browse/HIVE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-4018:
-----------------------------
    Status: Open  (was: Patch Available)

Can you refresh? The phabricator diff is not applying cleanly. Can you also load the latest patch?
[jira] [Updated] (HIVE-2905) Desc table can't read Chinese (UTF-8 character code)
[ https://issues.apache.org/jira/browse/HIVE-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaozhe Wang updated HIVE-2905:
-------------------------------
    Affects Version/s: 0.10.0

Desc table can't read Chinese (UTF-8 character code)

                Key: HIVE-2905
                URL: https://issues.apache.org/jira/browse/HIVE-2905
            Project: Hive
         Issue Type: Bug
         Components: CLI
   Affects Versions: 0.7.0, 0.10.0
        Environment: hive 0.7.0, mysql 5.1.45
           Reporter: Sheng Zhou

When describing a table from the command line or via Hive JDBC, the table's comment can't be read.
1. I have updated the javax.jdo.option.ConnectionURL parameter in the hive-site.xml file: jdbc:mysql://*.*.*.*:3306/hive?characterEncoding=UTF-8
2. In the mysql database, the comment field of the COLUMNS table can be read normally.
[jira] [Updated] (HIVE-2905) Desc table can't read Chinese (UTF-8 character code)
[ https://issues.apache.org/jira/browse/HIVE-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaozhe Wang updated HIVE-2905:
-------------------------------
    Environment: hive 0.7.0, mysql 5.1.45; hive 0.10.0, mysql 5.5.30  (was: hive 0.7.0, mysql 5.1.45)
[jira] [Updated] (HIVE-4221) Stripe-level merge for ORC files
[ https://issues.apache.org/jira/browse/HIVE-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-4221:
------------------------------
    Attachment: HIVE-4221.HIVE-4221.HIVE-4221.HIVE-4221.D9759.1.patch

sxyuan requested code review of "HIVE-4221 [jira] Stripe-level merge for ORC files".

Reviewers: kevinwilfong, omalley

As with RC files, we would like to be able to merge ORC files efficiently by reading/writing stripes without deserializing each row. Most of the logic is unchanged from merging for RC files, so the original code has been refactored for reuse.

TEST PLAN
Copied and modified RC file merge tests to use the ORC file format. Added a test case to TestOrcFile to make sure file-level column stats are merged properly.

REVISION DETAIL
https://reviews.facebook.net/D9759

AFFECTED FILES
data/files/smbbucket_1.orc
data/files/smbbucket_3.orc
data/files/smbbucket_2.orc
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
ql/src/test/results/clientpositive/orc_createas1.q.out
ql/src/test/results/clientpositive/orcfile_merge3.q.out
ql/src/test/results/clientpositive/orcfile_merge2.q.out
ql/src/test/results/clientpositive/alter_merge_orc2.q.out
ql/src/test/results/clientpositive/alter_merge_orc.q.out
ql/src/test/results/clientpositive/orcfile_merge1.q.out
ql/src/test/results/clientpositive/orcfile_merge4.q.out
ql/src/test/results/clientpositive/alter_merge_orc_stats.q.out
ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java
ql/src/test/queries/clientpositive/orcfile_merge2.q
ql/src/test/queries/clientpositive/orcfile_merge3.q
ql/src/test/queries/clientpositive/alter_merge_orc.q
ql/src/test/queries/clientpositive/orcfile_merge4.q
ql/src/test/queries/clientpositive/alter_merge_orc_stats.q
ql/src/test/queries/clientpositive/orcfile_merge1.q
ql/src/test/queries/clientpositive/alter_merge_orc2.q
ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java
ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java
ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java
ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java
ql/src/java/org/apache/hadoop/hive/ql/parse/AlterTablePartMergeFilesDesc.java
ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
ql/src/java/org/apache/hadoop/hive/ql/io/RCFile.java
ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/MergeWork.java
ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeOutputFormat.java
ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java
ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcBlockMergeRecordReader.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/Reader.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcBlockMergeInputFormat.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcMergeMapper.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/ReaderImpl.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/StripeReader.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java
ql/src/java/org/apache/hadoop/hive/ql/io/merge
ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeWork.java
ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeMapper.java
ql/src/java/org/apache/hadoop/hive/ql/io/merge/BlockMergeOutputFormat.java
ql/src/java/org/apache/hadoop/hive/ql/io/merge/BlockMergeTask.java

MANAGE HERALD RULES
https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
https://reviews.facebook.net/herald/transcript/23295/

To: kevinwilfong, omalley, sxyuan
Cc: JIRA

Stripe-level merge for ORC files

                Key: HIVE-4221
                URL: https://issues.apache.org/jira/browse/HIVE-4221
            Project: Hive
         Issue Type: Improvement
         Components: Query Processor
           Reporter: Samuel Yuan
           Assignee: Samuel Yuan
        Attachments: HIVE-4221.HIVE-4221.HIVE-4221.HIVE-4221.D9759.1.patch

As with RC files, we would like to be able to merge ORC files efficiently by reading/writing stripes without decompressing/recompressing them. This will be similar to the RC file merge, except that footers will have to be updated with the stripe positions in the new file.
[jira] [Updated] (HIVE-4221) Stripe-level merge for ORC files
[ https://issues.apache.org/jira/browse/HIVE-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Samuel Yuan updated HIVE-4221:
------------------------------
    Status: Patch Available  (was: Open)
[jira] [Commented] (HIVE-4018) MapJoin failing with Distributed Cache error
[ https://issues.apache.org/jira/browse/HIVE-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616156#comment-13616156 ]

Amareshwari Sriramadasu commented on HIVE-4018:
-----------------------------------------------

After updating the patch to trunk, the test fails with an NPE again. Will see what the cause is and update.
[jira] [Commented] (HIVE-4240) optimize hive.enforce.bucketing and hive.enforce.sorting insert
[ https://issues.apache.org/jira/browse/HIVE-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616174#comment-13616174 ]

Namit Jain commented on HIVE-4240:
----------------------------------

https://reviews.facebook.net/D9765

optimize hive.enforce.bucketing and hive.enforce.sorting insert

                Key: HIVE-4240
                URL: https://issues.apache.org/jira/browse/HIVE-4240
            Project: Hive
         Issue Type: Improvement
         Components: Query Processor
           Reporter: Namit Jain
           Assignee: Namit Jain

Consider the following scenario:

set hive.optimize.bucketmapjoin = true;
set hive.optimize.bucketmapjoin.sortedmerge = true;
set hive.input.format = org.apache.hadoop.hive.ql.io.BucketizedHiveInputFormat;
set hive.enforce.bucketing=true;
set hive.enforce.sorting=true;
set hive.exec.reducers.max = 1;
set hive.merge.mapfiles=false;
set hive.merge.mapredfiles=false;

-- Create two bucketed and sorted tables
CREATE TABLE test_table1 (key INT, value STRING) PARTITIONED BY (ds STRING) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;
CREATE TABLE test_table2 (key INT, value STRING) PARTITIONED BY (ds STRING) CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS;

FROM src
INSERT OVERWRITE TABLE test_table1 PARTITION (ds = '1') SELECT *;

-- Insert data into the bucketed table by selecting from another bucketed table
-- This should be a map-only operation
INSERT OVERWRITE TABLE test_table2 PARTITION (ds = '1')
SELECT a.key, a.value FROM test_table1 a WHERE a.ds = '1';

We should not need a reducer to perform the above operation.
[jira] [Updated] (HIVE-4240) optimize hive.enforce.bucketing and hive.enforce.sorting insert
[ https://issues.apache.org/jira/browse/HIVE-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-4240:
-----------------------------
    Attachment: hive.4240.1.patch
[jira] [Updated] (HIVE-2905) Desc table can't read Chinese (UTF-8 character code)
[ https://issues.apache.org/jira/browse/HIVE-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaozhe Wang updated HIVE-2905:
-------------------------------
    Labels: patch  (was: )
    Status: Patch Available  (was: Open)

The problem is that org.apache.hadoop.hive.ql.metadata.formatting.TextMetaDataFormatter.describeTable() uses DataOutputStream.writeBytes() to output the column info string. Unfortunately, DataOutputStream.writeBytes() writes out only the low byte of each character in the String, which garbles the output when a column comment contains non-latin1 characters. This simple patch solves the Unicode garbling problem when describing a table in the Hive client.
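A minimal standalone sketch of the bug described above (the class and method names here are ours, not Hive's): DataOutputStream.writeBytes(String) keeps only the low byte of each char, so a non-latin1 comment such as a Chinese one is garbled, while writing the String's UTF-8 bytes round-trips cleanly, which is the idea behind the fix.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class WriteBytesDemo {
    // Mimics the buggy path: writeBytes() discards the high byte of each char.
    public static byte[] lossyWrite(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeBytes(s);  // truncates each char to its low 8 bits
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // The safe alternative: encode the String to UTF-8 bytes explicitly.
    public static byte[] utf8Write(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.write(s.getBytes(StandardCharsets.UTF_8));
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String comment = "列注释";  // a Chinese column comment
        String lossy = new String(lossyWrite(comment), StandardCharsets.UTF_8);
        String clean = new String(utf8Write(comment), StandardCharsets.UTF_8);
        System.out.println(comment.equals(lossy)); // false: high bytes dropped
        System.out.println(comment.equals(clean)); // true: round-trips intact
    }
}
```

For pure ASCII input the two paths happen to agree, which is why the bug only shows up on non-latin1 comments.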
[jira] [Updated] (HIVE-2905) Desc table can't read Chinese (UTF-8 character code)
[ https://issues.apache.org/jira/browse/HIVE-2905?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaozhe Wang updated HIVE-2905:
-------------------------------
    Attachment: utf8-desc-comment.patch

A simple patch to resolve the garbling of column comments that contain Unicode characters.
[jira] [Created] (HIVE-4242) Predicate push down should also be provided to InputFormats
Owen O'Malley created HIVE-4242:
-----------------------------------

            Summary: Predicate push down should also be provided to InputFormats
                Key: HIVE-4242
                URL: https://issues.apache.org/jira/browse/HIVE-4242
            Project: Hive
         Issue Type: Bug
         Components: StorageHandler
           Reporter: Owen O'Malley
           Assignee: Owen O'Malley

Currently, the push-down predicate is only provided to native tables if the hive.optimize.index.filter configuration variable is set. There is no reason to prevent InputFormats from getting the information required to do predicate push down. Obviously, this will be very useful for ORC.
[jira] [Commented] (HIVE-4227) Add column level encryption to ORC files
[ https://issues.apache.org/jira/browse/HIVE-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616329#comment-13616329 ]

Owen O'Malley commented on HIVE-4227:
-------------------------------------

Supun, I've tagged this for Google Summer of Code. Take a look at: http://www.google-melange.com/gsoc/homepage/google/gsoc2013

Add column level encryption to ORC files

                Key: HIVE-4227
                URL: https://issues.apache.org/jira/browse/HIVE-4227
            Project: Hive
         Issue Type: New Feature
           Reporter: Owen O'Malley
             Labels: gsoc, gsoc2013

It would be useful to support column level encryption in ORC files. Since each column and its associated index is stored separately, encrypting a column separately isn't difficult. In terms of key distribution, it would make sense to use an external server like the one in HADOOP-9331.
[jira] [Created] (HIVE-4243) Fix column names in FileSinkOperator
Owen O'Malley created HIVE-4243:
-----------------------------------

            Summary: Fix column names in FileSinkOperator
                Key: HIVE-4243
                URL: https://issues.apache.org/jira/browse/HIVE-4243
            Project: Hive
         Issue Type: Bug
         Components: Serializers/Deserializers
           Reporter: Owen O'Malley

All of the ObjectInspectors given to SerDes by FileSinkOperator have virtual column names. Since the files are part of tables, Hive knows the column names. For self-describing file formats like ORC, having the real column names will improve understandability.
[jira] [Resolved] (HIVE-2162) Upgrade dependencies to Hadoop 0.20.2 and 0.20.203.0
[ https://issues.apache.org/jira/browse/HIVE-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley resolved HIVE-2162.
---------------------------------
    Resolution: Duplicate

This has been fixed already.

Upgrade dependencies to Hadoop 0.20.2 and 0.20.203.0

                Key: HIVE-2162
                URL: https://issues.apache.org/jira/browse/HIVE-2162
            Project: Hive
         Issue Type: Improvement
           Reporter: Owen O'Malley

Hadoop has released 0.20.203.0 and we should upgrade Hive's dependency to it.
[jira] [Created] (HIVE-4244) Make string dictionaries adaptive in ORC
Owen O'Malley created HIVE-4244:
-----------------------------------

            Summary: Make string dictionaries adaptive in ORC
                Key: HIVE-4244
                URL: https://issues.apache.org/jira/browse/HIVE-4244
            Project: Hive
         Issue Type: Bug
         Components: Serializers/Deserializers
           Reporter: Owen O'Malley
           Assignee: Owen O'Malley

The ORC writer should adaptively switch between dictionary and direct encoding. I'd propose looking at the first 100,000 values in each column and deciding whether there is sufficient loading in the dictionary to use dictionary encoding.
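The sampling heuristic proposed above can be sketched roughly as follows. This is not Hive's code; the class name, the 100,000-value sample size carried over from the proposal, and the 0.5 distinct-ratio threshold are our illustrative assumptions: dictionary encoding only pays off when many values repeat, so a simple test is the ratio of distinct values in the sample.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of an adaptive dictionary-vs-direct encoding decision:
// sample the first SAMPLE_SIZE values of a column and choose dictionary
// encoding only if the fraction of distinct values is low enough.
public class DictionaryHeuristic {
    static final int SAMPLE_SIZE = 100_000;          // per the JIRA proposal
    static final double MAX_DISTINCT_RATIO = 0.5;    // made-up threshold

    public static boolean useDictionary(List<String> columnValues) {
        int n = Math.min(columnValues.size(), SAMPLE_SIZE);
        if (n == 0) {
            return false; // nothing to sample; direct encoding by default
        }
        Set<String> distinct = new HashSet<>(columnValues.subList(0, n));
        return (double) distinct.size() / n <= MAX_DISTINCT_RATIO;
    }

    public static void main(String[] args) {
        List<String> repetitive = Arrays.asList("CA", "NY", "CA", "TX", "CA", "NY");
        List<String> unique = Arrays.asList("a1", "b2", "c3", "d4", "e5", "f6");
        System.out.println(useDictionary(repetitive)); // true: 3 distinct of 6
        System.out.println(useDictionary(unique));     // false: all distinct
    }
}
```

A real writer would also weigh value lengths and dictionary memory cost, but the distinct-ratio test captures the core trade-off.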
[jira] [Created] (HIVE-4245) Implement numeric dictionaries in ORC
Owen O'Malley created HIVE-4245:
-----------------------------------

            Summary: Implement numeric dictionaries in ORC
                Key: HIVE-4245
                URL: https://issues.apache.org/jira/browse/HIVE-4245
            Project: Hive
         Issue Type: New Feature
         Components: Serializers/Deserializers
           Reporter: Owen O'Malley
           Assignee: Owen O'Malley

For many applications, especially on de-normalized data, there is a lot of redundancy in the numeric columns. Therefore, it would make sense to adaptively use dictionary encodings for numeric columns in addition to string columns.
[jira] [Resolved] (HIVE-4121) ORC should have optional dictionaries for both strings and numeric types
[ https://issues.apache.org/jira/browse/HIVE-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Owen O'Malley resolved HIVE-4121.
---------------------------------
    Resolution: Duplicate

I forgot I had filed this, and filed the split-apart pieces as HIVE-4244 and HIVE-4245.

ORC should have optional dictionaries for both strings and numeric types

                Key: HIVE-4121
                URL: https://issues.apache.org/jira/browse/HIVE-4121
            Project: Hive
         Issue Type: New Feature
         Components: Serializers/Deserializers
           Reporter: Owen O'Malley
           Assignee: Owen O'Malley

Currently string columns always have dictionaries and numerics are always directly encoded. It would be better to make the encoding depend on a sample of the data. Perhaps the first 100k values should be evaluated for repeated values and the encoding picked for the stripe.
[jira] [Commented] (HIVE-4227) Add column level encryption to ORC files
[ https://issues.apache.org/jira/browse/HIVE-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616393#comment-13616393 ]

Andrew Purtell commented on HIVE-4227:
--------------------------------------

So do you envision this as using the facilities provided by HADOOP-9331?
[jira] [Assigned] (HIVE-3959) Update Partition Statistics in Metastore Layer
[ https://issues.apache.org/jira/browse/HIVE-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gang Tim Liu reassigned HIVE-3959:
----------------------------------
    Assignee: Gang Tim Liu  (was: Bhushan Mandhani)

Update Partition Statistics in Metastore Layer

                Key: HIVE-3959
                URL: https://issues.apache.org/jira/browse/HIVE-3959
            Project: Hive
         Issue Type: Improvement
         Components: Metastore, Statistics
           Reporter: Bhushan Mandhani
           Assignee: Gang Tim Liu
           Priority: Minor

When partitions are created using queries (insert overwrite and insert into), the StatsTask updates all stats. However, when partitions are added directly through metadata-only operations (either the CLI or direct calls to the Thrift Metastore), no stats are populated even if hive.stats.reliable is set to true. This puts us in a situation where we can't decide if stats are truly reliable or not. We propose that the fast stats (numFiles and totalSize), which don't require a scan of the data, should always be populated and be completely reliable. For now we are still excluding rowCount and rawDataSize because that would make these operations very expensive. Currently they are quick metadata-only ops.
[jira] [Created] (HIVE-4246) Implement predicate pushdown for ORC
Owen O'Malley created HIVE-4246:
-----------------------------------

            Summary: Implement predicate pushdown for ORC
                Key: HIVE-4246
                URL: https://issues.apache.org/jira/browse/HIVE-4246
            Project: Hive
         Issue Type: New Feature
           Reporter: Owen O'Malley
           Assignee: Owen O'Malley

By using the push-down predicates from the table scan operator, ORC can skip over 10,000 rows at a time that won't satisfy the predicate. This will help a lot, especially if the file is sorted by the column that is used in the predicate.
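The row-group skipping idea behind this issue can be illustrated with a small sketch (hypothetical types, not ORC's actual API): if each group of rows carries min/max statistics for a column, a predicate such as "col = v" can discard any group whose [min, max] range cannot contain v, without reading the rows at all. Sorting the file by that column makes the per-group ranges narrow and disjoint, so almost every group is skipped.

```java
// Hypothetical sketch of predicate-driven row-group skipping using
// per-group min/max statistics, in the spirit of the issue above.
public class RowGroupSkipping {
    public static class GroupStats {
        public final long min, max;
        public GroupStats(long min, long max) { this.min = min; this.max = max; }
    }

    // True if a row with column == value could exist in this group,
    // i.e. the group must be read; false means it can be skipped.
    public static boolean mightContain(GroupStats stats, long value) {
        return value >= stats.min && value <= stats.max;
    }

    public static void main(String[] args) {
        // Rows sorted by the predicate column: each 10,000-row group
        // covers a narrow, disjoint value range.
        GroupStats[] groups = {
            new GroupStats(0, 9_999),
            new GroupStats(10_000, 19_999),
            new GroupStats(20_000, 29_999),
        };
        long predicateValue = 12_345;
        for (int i = 0; i < groups.length; i++) {
            boolean read = mightContain(groups[i], predicateValue);
            System.out.println("group " + i + " read=" + read);
        }
        // Only group 1 is read; groups 0 and 2 are skipped entirely.
    }
}
```

The check is one-sided on purpose: min/max stats can prove a group is irrelevant, but a group that passes the check may still contain no matching rows.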
[jira] [Commented] (HIVE-4159) RetryingHMSHandler doesn't retry in enough cases
[ https://issues.apache.org/jira/browse/HIVE-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616502#comment-13616502 ]

Gang Tim Liu commented on HIVE-4159:
------------------------------------

+1

RetryingHMSHandler doesn't retry in enough cases

                Key: HIVE-4159
                URL: https://issues.apache.org/jira/browse/HIVE-4159
            Project: Hive
         Issue Type: Bug
         Components: Metastore
   Affects Versions: 0.11.0
           Reporter: Kevin Wilfong
           Assignee: Kevin Wilfong
        Attachments: HIVE-4159.1.patch.txt

HIVE-3524 introduced a change which caused JDOExceptions to be wrapped in MetaExceptions. This caused the RetryingHMSHandler to not retry on these exceptions.
[jira] [Commented] (HIVE-4155) Expose ORC's FileDump as a service
[ https://issues.apache.org/jira/browse/HIVE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616529#comment-13616529 ] Gang Tim Liu commented on HIVE-4155: +1 Expose ORC's FileDump as a service -- Key: HIVE-4155 URL: https://issues.apache.org/jira/browse/HIVE-4155 Project: Hive Issue Type: New Feature Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-4155.1.patch.txt Expose ORC's FileDump class as a service similar to RC File Cat e.g. hive --orcfiledump path_to_file Should run FileDump on the file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HIVE-4244) Make string dictionaries adaptive in ORC
[ https://issues.apache.org/jira/browse/HIVE-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong reassigned HIVE-4244: --- Assignee: Kevin Wilfong (was: Owen O'Malley) Make string dictionaries adaptive in ORC Key: HIVE-4244 URL: https://issues.apache.org/jira/browse/HIVE-4244 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Owen O'Malley Assignee: Kevin Wilfong The ORC writer should adaptively switch between dictionary and direct encoding. I'd propose looking at the first 100,000 values in each column and decide whether there is sufficient loading in the dictionary to use dictionary encoding. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Assigned] (HIVE-4245) Implement numeric dictionaries in ORC
[ https://issues.apache.org/jira/browse/HIVE-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pamela Vagata reassigned HIVE-4245: --- Assignee: Pamela Vagata (was: Owen O'Malley) Implement numeric dictionaries in ORC - Key: HIVE-4245 URL: https://issues.apache.org/jira/browse/HIVE-4245 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Owen O'Malley Assignee: Pamela Vagata For many applications, especially in de-normalized data, there is a lot of redundancy in the numeric columns. Therefore, it would make sense to adaptively use dictionary encodings for numeric columns in addition to string columns. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4244) Make string dictionaries adaptive in ORC
[ https://issues.apache.org/jira/browse/HIVE-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616547#comment-13616547 ] Kevin Wilfong commented on HIVE-4244: - Some initial thoughts based on some experiments. Dictionary encoding seems to be less effective than Zlib alone at compressing values when the number of distinct values is ~80% of the total number of values. This number can be made configurable. The dictionary is still smaller in memory, so we may be able to get away with writing out the data directly when writing the stripe. This should be comparable in performance to converting the dictionary index, which is already done. Also, if the uncompressed (but encoded) size of the dictionary + index (data stream) is greater than the uncompressed size of the original data, the compressed data tends to be larger as well, despite the sorting. This will be more expensive to figure out, as we don't know the size of the index until it has been run-length encoded. Make string dictionaries adaptive in ORC Key: HIVE-4244 URL: https://issues.apache.org/jira/browse/HIVE-4244 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Owen O'Malley Assignee: Kevin Wilfong The ORC writer should adaptively switch between dictionary and direct encoding. I'd propose looking at the first 100,000 values in each column and decide whether there is sufficient loading in the dictionary to use dictionary encoding. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4157) ORC runs out of heap when writing
[ https://issues.apache.org/jira/browse/HIVE-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616557#comment-13616557 ] Gang Tim Liu commented on HIVE-4157: +1 ORC runs out of heap when writing - Key: HIVE-4157 URL: https://issues.apache.org/jira/browse/HIVE-4157 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-4157.1.patch.txt The OutStream class used by the ORC file format seems to aggressively allocate memory for ByteBuffers and doesn't seem too eager to give it back. This causes issues with heap space, particularly when a wide tables/dynamic partitions are involved. As a first step to resolving this problem, the OutStream class can be modified to lazily allocate memory, and more actively make it available for garbage collection. Follow ups could include checking the amount of free memory as part of determining if a spill is needed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
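The lazy-allocation idea in this patch can be sketched as follows: allocate the ByteBuffer only when the first byte is written, and drop the reference after a flush so the memory becomes eligible for garbage collection between flushes. This is an illustrative model, not the actual OutStream code:

```java
import java.nio.ByteBuffer;

public class LazyOutStream {
    private final int bufferSize;
    private ByteBuffer buffer;  // stays null until the first write

    LazyOutStream(int bufferSize) { this.bufferSize = bufferSize; }

    void write(byte b) {
        // Lazily allocate, so wide tables with many unwritten streams
        // don't pay for buffers they never use.
        if (buffer == null) buffer = ByteBuffer.allocate(bufferSize);
        buffer.put(b);
    }

    boolean isAllocated() { return buffer != null; }

    void flush() {
        // ... contents would be written out here ...
        buffer = null;  // make the memory reclaimable until the next write
    }

    public static void main(String[] args) {
        LazyOutStream out = new LazyOutStream(256 * 1024);
        System.out.println(out.isAllocated()); // false: nothing written yet
        out.write((byte) 1);
        System.out.println(out.isAllocated()); // true
        out.flush();
        System.out.println(out.isAllocated()); // false again
    }
}
```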
Do we know the release date for Hive 0.11? EOM
[jira] [Commented] (HIVE-3464) Merging join tree may reorder joins which could be invalid
[ https://issues.apache.org/jira/browse/HIVE-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616598#comment-13616598 ] Phabricator commented on HIVE-3464: --- vikram has commented on the revision HIVE-3464 [jira] Merging join tree may reorder joins which could be invalid. Comments. INLINE COMMENTS ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinResolver.java:332 I am not sure these changes are relevant to this jira. There are already other jiras - HIVE-3996 and HIVE-4071 raised for issues in this section of code and currently blocked on HIVE-3891 which moves these changes into a different class. ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinResolver.java:357 Same comment as above. ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinResolver.java:369 Same as above. ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinResolver.java:381 Same as above. REVISION DETAIL https://reviews.facebook.net/D5409 To: JIRA, navis Cc: njain, vikram Merging join tree may reorder joins which could be invalid -- Key: HIVE-3464 URL: https://issues.apache.org/jira/browse/HIVE-3464 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Navis Assignee: Navis Attachments: HIVE-3464.D5409.2.patch, HIVE-3464.D5409.3.patch, HIVE-3464.D5409.4.patch, HIVE-3464.D5409.5.patch Currently, hive merges join tree from right to left regardless of join types, which may introduce join reordering. For example, select * from a join a b on a.key=b.key join a c on b.key=c.key join a d on a.key=d.key; Hive tries to merge join tree in a-d=b-d, a-d=a-b, b-c=a-b order and a-d=a-b and b-c=a-b will be merged. Final join tree is a-(bdc). With this, ab-d join will be executed prior to ab-c. But if join type of -c and -d is different, this is not valid. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4247) Filtering on a hbase row key duplicates results across multiple mappers
Karthik Kumara created HIVE-4247: Summary: Filtering on a hbase row key duplicates results across multiple mappers Key: HIVE-4247 URL: https://issues.apache.org/jira/browse/HIVE-4247 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.9.0 Environment: All Platforms Reporter: Karthik Kumara Steps to reproduce: 1. Create a Hive external table with HiveHbaseHandler with enough data in the hbase table to spawn multiple mappers for the hive query. 2. Write a query which has a filter (in the where clause) based on the hbase row key. 3. Running the map reduce job leads to each mapper querying the entire data set, duplicating the data for each mapper. Each mapper processes the entire filtered range, and the results get multiplied as the number of mappers run. Expected behavior: Each mapper should process a different part of the data and should not duplicate it. Cause: The cause seems to be the convertFilter method in HiveHBaseTableInputFormat. convertFilter has this piece of code, which rewrites the start and the stop row for each split and leads each mapper to process the entire range: if (tableSplit != null) { tableSplit = new TableSplit( tableSplit.getTableName(), startRow, stopRow, tableSplit.getRegionLocation()); } The scan already has the start and stop row set when the splits are created, so this piece of code is probably redundant. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
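The effect of that rewrite can be modeled with a toy example, using integer row keys and hypothetical names in place of HBase's byte[] keys. The report's suggested fix is simply to remove the rewrite, since each split already carries its own range; the intersect variant below just illustrates why per-split boundaries must be preserved:

```java
public class SplitRanges {
    static final class Split {
        final long start, stop;
        Split(long start, long stop) { this.start = start; this.stop = stop; }
    }

    // Buggy behavior: every split's boundaries are replaced by the query's
    // global filter range, so all mappers scan (and emit) the same rows.
    static Split buggyRewrite(Split split, long filterStart, long filterStop) {
        return new Split(filterStart, filterStop);
    }

    // Correct behavior: each mapper keeps a disjoint slice, at most narrowed
    // by the filter range.
    static Split intersect(Split split, long filterStart, long filterStop) {
        return new Split(Math.max(split.start, filterStart),
                         Math.min(split.stop, filterStop));
    }

    public static void main(String[] args) {
        Split a = new Split(0, 100), b = new Split(100, 200);
        System.out.println(buggyRewrite(a, 0, 200).stop); // 200: full range, duplicated work
        System.out.println(intersect(a, 0, 200).stop);    // 100: a's own slice
        System.out.println(intersect(b, 0, 200).start);   // 100: disjoint from a
    }
}
```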
[jira] [Commented] (HIVE-4245) Implement numeric dictionaries in ORC
[ https://issues.apache.org/jira/browse/HIVE-4245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616613#comment-13616613 ] Owen O'Malley commented on HIVE-4245: - If you look at the original ORC github, you can see a float and double redblack tree that I pulled out in getting it ready for the initial push into Apache. https://github.com/hortonworks/orc/tree/9cdb2e88d377c801655fbb9015938ea3a93e12ca/src/main/java/org/apache/hadoop/hive/ql/io/orc Implement numeric dictionaries in ORC - Key: HIVE-4245 URL: https://issues.apache.org/jira/browse/HIVE-4245 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Owen O'Malley Assignee: Pamela Vagata For many applications, especially in de-normalized data, there is a lot of redundancy in the numeric columns. Therefore, it would make sense to adaptively use dictionary encodings for numeric columns in addition to string columns. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4247) Filtering on a hbase row key duplicates results across multiple mappers
[ https://issues.apache.org/jira/browse/HIVE-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Karthik Kumara updated HIVE-4247: - Attachment: HiveHBaseTableInputFormat.patch Suggested patch Filtering on a hbase row key duplicates results across multiple mappers --- Key: HIVE-4247 URL: https://issues.apache.org/jira/browse/HIVE-4247 Project: Hive Issue Type: Bug Components: HBase Handler Affects Versions: 0.9.0 Environment: All Platforms Reporter: Karthik Kumara Labels: patch Attachments: HiveHBaseTableInputFormat.patch Steps to reproduce 1. Create a Hive external table with HiveHbaseHandler with enough data in the hbase table to spawn multiple mappers for the hive query. 2. Write a query which has a filter (in the where clause) based on the hbase row key. 3. Running the map reduce job leads to each mapper querying the entire data set. duplicating the data for each mapper. Each mapper processes the entire filtered range and the results get multiplied as the number of mappers run. Expected behavior: Each mapper should process a different part of the data and should not duplicate. Cause: The cause seems to be the convertFilter method in HiveHBaseTableInputFormat. convertFilter has this piece of code which rewrites the start and the stop row for each split which leads each mapper to process the entire range if (tableSplit != null) { tableSplit = new TableSplit( tableSplit.getTableName(), startRow, stopRow, tableSplit.getRegionLocation()); } The scan already has the start and stop row set when the splits are created. So this piece of code is probably redundant. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4244) Make string dictionaries adaptive in ORC
[ https://issues.apache.org/jira/browse/HIVE-4244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616657#comment-13616657 ] Owen O'Malley commented on HIVE-4244: - We should play with different values, but I was guessing the right cutover point for the heuristic was at a loading of 2 to 3 (50% to 33% distinct values). We aren't really going to know whether the heuristic is right or wrong unless we compare both encodings, which is much too expensive. By taking a good guess after looking at the start of the stripe, we can get good performance most of the time. Make string dictionaries adaptive in ORC Key: HIVE-4244 URL: https://issues.apache.org/jira/browse/HIVE-4244 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Owen O'Malley Assignee: Kevin Wilfong The ORC writer should adaptively switch between dictionary and direct encoding. I'd propose looking at the first 100,000 values in each column and decide whether there is sufficient loading in the dictionary to use dictionary encoding. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
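The heuristic discussed in this thread can be sketched directly: sample the first N values, compute the loading (total values divided by distinct values), and choose dictionary encoding only when the loading clears a threshold. A minimal illustration, assuming a cutover loading of 2.0 (the low end of the 2-to-3 range suggested above, i.e. at most 50% distinct values):

```java
import java.util.HashSet;
import java.util.Set;

public class DictionaryHeuristic {
    // Loading = total values / distinct values. A loading of 2.0 means each
    // distinct value appears twice on average.
    static boolean useDictionary(String[] sample, double minLoading) {
        Set<String> distinct = new HashSet<>();
        for (String s : sample) distinct.add(s);
        double loading = (double) sample.length / distinct.size();
        return loading >= minLoading;
    }

    public static void main(String[] args) {
        String[] repetitive = {"a", "b", "a", "b", "a", "b"}; // loading 3.0
        String[] unique = {"a", "b", "c", "d", "e", "f"};     // loading 1.0
        System.out.println(useDictionary(repetitive, 2.0)); // true
        System.out.println(useDictionary(unique, 2.0));     // false
    }
}
```

As the comment notes, this is only a guess made from the start of the stripe; verifying it would require encoding the data both ways, which is too expensive.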
[jira] [Created] (HIVE-4248) Implement a memory manager for ORC
Owen O'Malley created HIVE-4248: --- Summary: Implement a memory manager for ORC Key: HIVE-4248 URL: https://issues.apache.org/jira/browse/HIVE-4248 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Owen O'Malley Assignee: Owen O'Malley With the large default stripe size (256MB) and dynamic partitions, it is quite easy for users to run out of memory when writing ORC files. We probably need a solution that keeps track of the total number of concurrent ORC writers and divides the available heap space between them. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4248) Implement a memory manager for ORC
[ https://issues.apache.org/jira/browse/HIVE-4248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616691#comment-13616691 ] Owen O'Malley commented on HIVE-4248: - This may result in ORC files with smaller stripes, but that seems far better than letting the users get out of memory exceptions. Implement a memory manager for ORC -- Key: HIVE-4248 URL: https://issues.apache.org/jira/browse/HIVE-4248 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Owen O'Malley Assignee: Owen O'Malley With the large default stripe size (256MB) and dynamic partitions, it is quite easy for users to run out of memory when writing ORC files. We probably need a solution that keeps track of the total number of concurrent ORC writers and divides the available heap space between them. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
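A minimal sketch of the scheme described here: a shared pool tracks how many writers are open and splits a fixed budget evenly among them, so each writer's stripe is flushed earlier as more writers appear instead of the task running out of heap. Names are hypothetical, not the eventual HIVE-4248 design:

```java
public class OrcMemoryPool {
    private final long totalPool;
    private int writers;

    OrcMemoryPool(long totalPool) { this.totalPool = totalPool; }

    synchronized void addWriter() { writers++; }

    synchronized void removeWriter() { writers--; }

    // Each writer's budget shrinks as more writers register; a writer whose
    // buffered stripe exceeds its budget would flush early (smaller stripes,
    // but no OutOfMemoryError).
    synchronized long budgetPerWriter() {
        return writers == 0 ? totalPool : totalPool / writers;
    }

    public static void main(String[] args) {
        OrcMemoryPool pool = new OrcMemoryPool(512L * 1024 * 1024);
        pool.addWriter();
        System.out.println(pool.budgetPerWriter()); // 536870912 (512MB)
        pool.addWriter();
        System.out.println(pool.budgetPerWriter()); // 268435456 (256MB each)
    }
}
```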
[jira] [Resolved] (HIVE-4197) Bring windowing support inline with SQL Standard
[ https://issues.apache.org/jira/browse/HIVE-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-4197. Resolution: Fixed Assignee: Harish Butani Committed to branch. Thanks, Harish! Bring windowing support inline with SQL Standard Key: HIVE-4197 URL: https://issues.apache.org/jira/browse/HIVE-4197 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Harish Butani Assignee: Harish Butani Attachments: WindowingSpecification.pdf The current behavior differs from the Standard in several significant places. Please review the attached doc; there are still a few open issues. Once we agree on the behavior, we can proceed with fixing the implementation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-4190) OVER clauses with ORDER BY not getting windowing set properly
[ https://issues.apache.org/jira/browse/HIVE-4190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ashutosh Chauhan resolved HIVE-4190. Resolution: Fixed This patch is subsumed in HIVE-4197 which is now fixed. OVER clauses with ORDER BY not getting windowing set properly - Key: HIVE-4190 URL: https://issues.apache.org/jira/browse/HIVE-4190 Project: Hive Issue Type: Bug Components: PTF-Windowing Affects Versions: 0.11.0 Reporter: Alan Gates Given a query like: select s, avg(f) over (partition by si order by d) from over100k; Hive is not setting the window frame properly. The order by creates an implicit window frame of 'unbounded preceding' but Hive is treating the above query as if it has no window. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4199) ORC writer doesn't handle non-UTF8 encoded Text properly
[ https://issues.apache.org/jira/browse/HIVE-4199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4199: -- Attachment: HIVE-4199.HIVE-4199.HIVE-4199.D9501.4.patch sxyuan updated the revision HIVE-4199 [jira] ORC writer doesn't handle non-UTF8 encoded Text properly. Updated test case to clarify the expected behaviour. Reviewers: kevinwilfong REVISION DETAIL https://reviews.facebook.net/D9501 CHANGE SINCE LAST DIFF https://reviews.facebook.net/D9501?vs=30009id=30675#toc AFFECTED FILES data/files/nonutf8.txt ql/src/test/results/clientpositive/orc_nonutf8.q.out ql/src/test/queries/clientpositive/orc_nonutf8.q ql/src/java/org/apache/hadoop/hive/ql/io/orc/StringRedBlackTree.java ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java To: kevinwilfong, sxyuan Cc: JIRA ORC writer doesn't handle non-UTF8 encoded Text properly Key: HIVE-4199 URL: https://issues.apache.org/jira/browse/HIVE-4199 Project: Hive Issue Type: Bug Components: Serializers/Deserializers Reporter: Samuel Yuan Assignee: Samuel Yuan Priority: Minor Attachments: HIVE-4199.HIVE-4199.HIVE-4199.D9501.1.patch, HIVE-4199.HIVE-4199.HIVE-4199.D9501.2.patch, HIVE-4199.HIVE-4199.HIVE-4199.D9501.3.patch, HIVE-4199.HIVE-4199.HIVE-4199.D9501.4.patch StringTreeWriter currently converts fields stored as Text objects into Strings. This can lose information (see http://en.wikipedia.org/wiki/Replacement_character#Replacement_character), and is also unnecessary since the dictionary stores Text objects. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4249) current database is retained between sessions in hive server2
Thejas M Nair created HIVE-4249: --- Summary: current database is retained between sessions in hive server2 Key: HIVE-4249 URL: https://issues.apache.org/jira/browse/HIVE-4249 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.11.0 The current database is retained between sessions in hive server2. To reproduce, run this several times - bin/beeline -e '!connect jdbc:hive2://localhost:1 scott tiger org.apache.hive.jdbc.HiveDriver' -e 'show tables;' -e ' use newdb;' -e ' show tables;' Table ab is in the default database; newtab is in the newdb database. The expected result is {code} +---+ | tab_name | +---+ | ab| +---+ 1 row selected (0.457 seconds) No rows affected (0.039 seconds) +---+ | tab_name | +---+ | newtab| +---+ {code} But after running it several times, you see threads retaining newdb as the default database, i.e. the output of the above command becomes - {code} +---+ | tab_name | +---+ | newtab| +---+ 1 row selected (0.518 seconds) No rows affected (0.052 seconds) +---+ | tab_name | +---+ | newtab| +---+ 1 row selected (0.232 seconds) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4250) Closing lots of RecordWriters is slow
Owen O'Malley created HIVE-4250: --- Summary: Closing lots of RecordWriters is slow Key: HIVE-4250 URL: https://issues.apache.org/jira/browse/HIVE-4250 Project: Hive Issue Type: New Feature Components: Serializers/Deserializers Reporter: Owen O'Malley Assignee: Owen O'Malley In FileSinkOperator, all of the RecordWriters are closed sequentially. For queries with a lot of dynamic partitions this can add substantially to the task time. For one query in particular, after processing all of the records in a few minutes the reduces spend 15 minutes closing all of the RC files. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
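One plausible remedy for the slow sequential close is to fan the close() calls out over a small thread pool. This is a sketch under that assumption, not the actual FileSinkOperator change:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelClose {
    // Stand-in for Hive's RecordWriter; only close() matters here.
    interface RecordWriter { void close() throws Exception; }

    // Close all writers concurrently; waiting on each Future propagates any
    // failure instead of silently losing it.
    static void closeAll(List<RecordWriter> writers, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<?>> futures = new ArrayList<>();
        for (RecordWriter w : writers) {
            futures.add(pool.submit(() -> { w.close(); return null; }));
        }
        try {
            for (Future<?> f : futures) f.get();
        } catch (Exception e) {
            throw new RuntimeException("failed to close a writer", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<RecordWriter> writers = new ArrayList<>();
        AtomicInteger closed = new AtomicInteger();
        for (int i = 0; i < 100; i++) writers.add(closed::incrementAndGet);
        closeAll(writers, 8);
        System.out.println(closed.get()); // 100
    }
}
```

With many dynamic partitions, each close flushes its own file, so the closes are mostly independent I/O and parallelize well.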
[jira] [Commented] (HIVE-4249) current database is retained between sessions in hive server2
[ https://issues.apache.org/jira/browse/HIVE-4249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616754#comment-13616754 ] Prasad Mujumdar commented on HIVE-4249: --- Looks like duplicate of [HIVE-4171|https://issues.apache.org/jira/browse/HIVE-4171] current database is retained between sessions in hive server2 --- Key: HIVE-4249 URL: https://issues.apache.org/jira/browse/HIVE-4249 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.11.0 current database is retained between sessions in hive server2. To reproduce - Run this serveral times - bin/beeline -e '!connect jdbc:hive2://localhost:1 scott tiger org.apache.hive.jdbc.HiveDriver' -e 'show tables;' -e ' use newdb;' -e ' show tables;' table ab is a table in default database, newtab is a table in newdb database. Expected result is {code} +---+ | tab_name | +---+ | ab| +---+ 1 row selected (0.457 seconds) No rows affected (0.039 seconds) +---+ | tab_name | +---+ | newtab| +---+ {code} But after running it several, times you see threads having newdb as default database, ie the output of above command becomes - {code} +---+ | tab_name | +---+ | newtab| +---+ 1 row selected (0.518 seconds) No rows affected (0.052 seconds) +---+ | tab_name | +---+ | newtab| +---+ 1 row selected (0.232 seconds) {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4159) RetryingHMSHandler doesn't retry in enough cases
[ https://issues.apache.org/jira/browse/HIVE-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616821#comment-13616821 ] Gang Tim Liu commented on HIVE-4159: Committed. thanks Kevin. RetryingHMSHandler doesn't retry in enough cases Key: HIVE-4159 URL: https://issues.apache.org/jira/browse/HIVE-4159 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-4159.1.patch.txt HIVE-3524 introduced a change which caused JDOExceptions to be wrapped in MetaExceptions. This caused the RetryingHMSHandler to not retry on these exceptions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4159) RetryingHMSHandler doesn't retry in enough cases
[ https://issues.apache.org/jira/browse/HIVE-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Tim Liu updated HIVE-4159: --- Fix Version/s: 0.11.0 RetryingHMSHandler doesn't retry in enough cases Key: HIVE-4159 URL: https://issues.apache.org/jira/browse/HIVE-4159 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Fix For: 0.11.0 Attachments: HIVE-4159.1.patch.txt HIVE-3524 introduced a change which caused JDOExceptions to be wrapped in MetaExceptions. This caused the RetryingHMSHandler to not retry on these exceptions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4159) RetryingHMSHandler doesn't retry in enough cases
[ https://issues.apache.org/jira/browse/HIVE-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Tim Liu updated HIVE-4159: --- Resolution: Fixed Status: Resolved (was: Patch Available) RetryingHMSHandler doesn't retry in enough cases Key: HIVE-4159 URL: https://issues.apache.org/jira/browse/HIVE-4159 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Fix For: 0.11.0 Attachments: HIVE-4159.1.patch.txt HIVE-3524 introduced a change which caused JDOExceptions to be wrapped in MetaExceptions. This caused the RetryingHMSHandler to not retry on these exceptions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4155) Expose ORC's FileDump as a service
[ https://issues.apache.org/jira/browse/HIVE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616828#comment-13616828 ] Gang Tim Liu commented on HIVE-4155: Committed. thanks Kevin Expose ORC's FileDump as a service -- Key: HIVE-4155 URL: https://issues.apache.org/jira/browse/HIVE-4155 Project: Hive Issue Type: New Feature Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-4155.1.patch.txt Expose ORC's FileDump class as a service similar to RC File Cat e.g. hive --orcfiledump path_to_file Should run FileDump on the file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4155) Expose ORC's FileDump as a service
[ https://issues.apache.org/jira/browse/HIVE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Tim Liu updated HIVE-4155: --- Resolution: Fixed Status: Resolved (was: Patch Available) Expose ORC's FileDump as a service -- Key: HIVE-4155 URL: https://issues.apache.org/jira/browse/HIVE-4155 Project: Hive Issue Type: New Feature Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Fix For: 0.11.0 Attachments: HIVE-4155.1.patch.txt Expose ORC's FileDump class as a service similar to RC File Cat e.g. hive --orcfiledump path_to_file Should run FileDump on the file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4155) Expose ORC's FileDump as a service
[ https://issues.apache.org/jira/browse/HIVE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Tim Liu updated HIVE-4155: --- Fix Version/s: 0.11.0 Expose ORC's FileDump as a service -- Key: HIVE-4155 URL: https://issues.apache.org/jira/browse/HIVE-4155 Project: Hive Issue Type: New Feature Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Fix For: 0.11.0 Attachments: HIVE-4155.1.patch.txt Expose ORC's FileDump class as a service similar to RC File Cat e.g. hive --orcfiledump path_to_file Should run FileDump on the file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4119) ANALYZE TABLE ... COMPUTE STATISTICS FOR COLUMNS fails with NPE if the table is empty
[ https://issues.apache.org/jira/browse/HIVE-4119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616855#comment-13616855 ] Carl Steinbach commented on HIVE-4119: -- +1. Will commit if tests pass. ANALYZE TABLE ... COMPUTE STATISTICS FOR COLUMNS fails with NPE if the table is empty - Key: HIVE-4119 URL: https://issues.apache.org/jira/browse/HIVE-4119 Project: Hive Issue Type: Bug Components: Statistics Affects Versions: 0.10.0 Reporter: Lenni Kuff Assignee: Shreepadma Venugopalan Priority: Critical Attachments: HIVE-4119.1.patch, HIVE-4119.2.patch ANALYZE TABLE ... COMPUTE STATISTICS FOR COLUMNS fails with NPE if the table is empty {code} hive -e create table empty_table (i int); select compute_stats(i, 16) from empty_table java.lang.NullPointerException at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableIntObjectInspector.get(WritableIntObjectInspector.java:35) at org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getInt(PrimitiveObjectInspectorUtils.java:535) at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFComputeStats$GenericUDAFLongStatsEvaluator.iterate(GenericUDAFComputeStats.java:477) at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:139) at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1099) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:558) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567) at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:428) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) at 
org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:231) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1132) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:558) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567) at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:567) at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:193) at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61) at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:428) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340) at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:231) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.NullPointerException at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableIntObjectInspector.get(WritableIntObjectInspector.java:35) at 
org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getInt(PrimitiveObjectInspectorUtils.java:535) at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFComputeStats$GenericUDAFLongStatsEvaluator.iterate(GenericUDAFComputeStats.java:477) at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:139) at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:1099) ... 15 more org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NullPointerException at
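The stack traces above bottom out in WritableIntObjectInspector.get being handed a null: on an empty table the stats aggregator's iterate step receives a null column value and unwraps it without a guard. A minimal sketch of the null-guard pattern such a fix needs (hypothetical class and method names, not Hive's actual GenericUDAFComputeStats code):

```java
import java.util.List;

// Sketch of a null-tolerant long-stats accumulator; names are illustrative, not Hive's.
class LongStatsSketch {
    private long count = 0;
    private long min = Long.MAX_VALUE;
    private long max = Long.MIN_VALUE;

    // Guard against null before unwrapping, mirroring the missing check behind the NPE.
    void iterate(Long value) {
        if (value == null) {
            return; // skip nulls instead of dereferencing them
        }
        count++;
        min = Math.min(min, value);
        max = Math.max(max, value);
    }

    // On empty input, report a zero-count result rather than throwing.
    String terminate() {
        if (count == 0) {
            return "count=0";
        }
        return "count=" + count + " min=" + min + " max=" + max;
    }

    static String statsOf(List<Long> column) {
        LongStatsSketch s = new LongStatsSketch();
        for (Long v : column) {
            s.iterate(v);
        }
        return s.terminate();
    }
}
```

With the guard in place, an empty or all-null column yields a zero-count stats row instead of a NullPointerException.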
[jira] [Created] (HIVE-4251) Indices can't be built on tables whose schema info comes from SerDe
Mark Wagner created HIVE-4251: - Summary: Indices can't be built on tables whose schema info comes from SerDe Key: HIVE-4251 URL: https://issues.apache.org/jira/browse/HIVE-4251 Project: Hive Issue Type: Bug Affects Versions: 0.11.0, 0.10.1 Reporter: Mark Wagner Assignee: Mark Wagner Building indices on tables that get the schema information from the deserializer (e.g. Avro-backed tables) doesn't work because when the column is checked to exist, the correct API isn't used. {code} hive describe doctors; OK # col_name data_type comment number int from deserializer first_name string from deserializer last_name string from deserializer Time taken: 0.215 seconds, Fetched: 5 row(s) hive create index doctors_index on table doctors(number) as 'compact' with deferred rebuild; FAILED: Error in metadata: java.lang.RuntimeException: Check the index columns, they should appear in the table being indexed. FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4251) Indices can't be built on tables whose schema info comes from SerDe
[ https://issues.apache.org/jira/browse/HIVE-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Wagner updated HIVE-4251: -- Attachment: HIVE-4251.1.patch The attached patch fixes this for both the 0.10 branch and trunk. Indices can't be built on tables whose schema info comes from SerDe --- Key: HIVE-4251 URL: https://issues.apache.org/jira/browse/HIVE-4251 Project: Hive Issue Type: Bug Affects Versions: 0.11.0, 0.10.1 Reporter: Mark Wagner Assignee: Mark Wagner Attachments: HIVE-4251.1.patch Building indices on tables that get the schema information from the deserializer (e.g. Avro-backed tables) doesn't work because when the column is checked to exist, the correct API isn't used. {code} hive describe doctors; OK # col_name data_type comment number int from deserializer first_name string from deserializer last_name string from deserializer Time taken: 0.215 seconds, Fetched: 5 row(s) hive create index doctors_index on table doctors(number) as 'compact' with deferred rebuild; FAILED: Error in metadata: java.lang.RuntimeException: Check the index columns, they should appear in the table being indexed. FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask {code}
[jira] [Updated] (HIVE-4251) Indices can't be built on tables whose schema info comes from SerDe
[ https://issues.apache.org/jira/browse/HIVE-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Wagner updated HIVE-4251: -- Fix Version/s: 0.10.1 0.11.0 Affects Version/s: 0.10.0 Status: Patch Available (was: Open) Indices can't be built on tables whose schema info comes from SerDe --- Key: HIVE-4251 URL: https://issues.apache.org/jira/browse/HIVE-4251 Project: Hive Issue Type: Bug Affects Versions: 0.10.0, 0.11.0, 0.10.1 Reporter: Mark Wagner Assignee: Mark Wagner Fix For: 0.11.0, 0.10.1 Attachments: HIVE-4251.1.patch Building indices on tables that get the schema information from the deserializer (e.g. Avro-backed tables) doesn't work because when the column is checked to exist, the correct API isn't used. {code} hive describe doctors; OK # col_name data_type comment number int from deserializer first_name string from deserializer last_name string from deserializer Time taken: 0.215 seconds, Fetched: 5 row(s) hive create index doctors_index on table doctors(number) as 'compact' with deferred rebuild; FAILED: Error in metadata: java.lang.RuntimeException: Check the index columns, they should appear in the table being indexed. FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask {code}
[jira] [Commented] (HIVE-4157) ORC runs out of heap when writing
[ https://issues.apache.org/jira/browse/HIVE-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616898#comment-13616898 ] Gang Tim Liu commented on HIVE-4157: Committed. Thanks, Kevin. ORC runs out of heap when writing - Key: HIVE-4157 URL: https://issues.apache.org/jira/browse/HIVE-4157 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Attachments: HIVE-4157.1.patch.txt The OutStream class used by the ORC file format seems to aggressively allocate memory for ByteBuffers and doesn't seem too eager to give it back. This causes issues with heap space, particularly when wide tables or dynamic partitions are involved. As a first step to resolving this problem, the OutStream class can be modified to lazily allocate memory, and more actively make it available for garbage collection. Follow-ups could include checking the amount of free memory as part of determining if a spill is needed.
[jira] [Updated] (HIVE-4157) ORC runs out of heap when writing
[ https://issues.apache.org/jira/browse/HIVE-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Tim Liu updated HIVE-4157: --- Resolution: Fixed Status: Resolved (was: Patch Available) ORC runs out of heap when writing - Key: HIVE-4157 URL: https://issues.apache.org/jira/browse/HIVE-4157 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Fix For: 0.11.0 Attachments: HIVE-4157.1.patch.txt The OutStream class used by the ORC file format seems to aggressively allocate memory for ByteBuffers and doesn't seem too eager to give it back. This causes issues with heap space, particularly when wide tables or dynamic partitions are involved. As a first step to resolving this problem, the OutStream class can be modified to lazily allocate memory, and more actively make it available for garbage collection. Follow-ups could include checking the amount of free memory as part of determining if a spill is needed.
[jira] [Updated] (HIVE-4157) ORC runs out of heap when writing
[ https://issues.apache.org/jira/browse/HIVE-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gang Tim Liu updated HIVE-4157: --- Fix Version/s: 0.11.0 ORC runs out of heap when writing - Key: HIVE-4157 URL: https://issues.apache.org/jira/browse/HIVE-4157 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Fix For: 0.11.0 Attachments: HIVE-4157.1.patch.txt The OutStream class used by the ORC file format seems to aggressively allocate memory for ByteBuffers and doesn't seem too eager to give it back. This causes issues with heap space, particularly when wide tables or dynamic partitions are involved. As a first step to resolving this problem, the OutStream class can be modified to lazily allocate memory, and more actively make it available for garbage collection. Follow-ups could include checking the amount of free memory as part of determining if a spill is needed.
[jira] [Commented] (HIVE-4157) ORC runs out of heap when writing
[ https://issues.apache.org/jira/browse/HIVE-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616901#comment-13616901 ] Gang Tim Liu commented on HIVE-4157: Forgot to mention: tests passed. Sorry. ORC runs out of heap when writing - Key: HIVE-4157 URL: https://issues.apache.org/jira/browse/HIVE-4157 Project: Hive Issue Type: Improvement Components: Serializers/Deserializers Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Fix For: 0.11.0 Attachments: HIVE-4157.1.patch.txt The OutStream class used by the ORC file format seems to aggressively allocate memory for ByteBuffers and doesn't seem too eager to give it back. This causes issues with heap space, particularly when wide tables or dynamic partitions are involved. As a first step to resolving this problem, the OutStream class can be modified to lazily allocate memory, and more actively make it available for garbage collection. Follow-ups could include checking the amount of free memory as part of determining if a spill is needed.
[jira] [Commented] (HIVE-4159) RetryingHMSHandler doesn't retry in enough cases
[ https://issues.apache.org/jira/browse/HIVE-4159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616902#comment-13616902 ] Gang Tim Liu commented on HIVE-4159: Forgot to mention: tests passed. sorry RetryingHMSHandler doesn't retry in enough cases Key: HIVE-4159 URL: https://issues.apache.org/jira/browse/HIVE-4159 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Fix For: 0.11.0 Attachments: HIVE-4159.1.patch.txt HIVE-3524 introduced a change which caused JDOExceptions to be wrapped in MetaExceptions. This caused the RetryingHMSHandler to not retry on these exceptions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4155) Expose ORC's FileDump as a service
[ https://issues.apache.org/jira/browse/HIVE-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616903#comment-13616903 ] Gang Tim Liu commented on HIVE-4155: Forgot to mention: tests passed. sorry Expose ORC's FileDump as a service -- Key: HIVE-4155 URL: https://issues.apache.org/jira/browse/HIVE-4155 Project: Hive Issue Type: New Feature Affects Versions: 0.11.0 Reporter: Kevin Wilfong Assignee: Kevin Wilfong Fix For: 0.11.0 Attachments: HIVE-4155.1.patch.txt Expose ORC's FileDump class as a service similar to RC File Cat e.g. hive --orcfiledump path_to_file Should run FileDump on the file. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-3464) Merging join tree may reorder joins which could be invalid
[ https://issues.apache.org/jira/browse/HIVE-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616932#comment-13616932 ] Phabricator commented on HIVE-3464: --- navis has commented on the revision HIVE-3464 [jira] Merging join tree may reorder joins which could be invalid. INLINE COMMENTS ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/CommonJoinResolver.java:332 Ok, it's on other issue. Will be removed. Any other comments on changes? REVISION DETAIL https://reviews.facebook.net/D5409 To: JIRA, navis Cc: njain, vikram Merging join tree may reorder joins which could be invalid -- Key: HIVE-3464 URL: https://issues.apache.org/jira/browse/HIVE-3464 Project: Hive Issue Type: Bug Components: Query Processor Affects Versions: 0.10.0 Reporter: Navis Assignee: Navis Attachments: HIVE-3464.D5409.2.patch, HIVE-3464.D5409.3.patch, HIVE-3464.D5409.4.patch, HIVE-3464.D5409.5.patch Currently, hive merges join tree from right to left regardless of join types, which may introduce join reordering. For example, select * from a join a b on a.key=b.key join a c on b.key=c.key join a d on a.key=d.key; Hive tries to merge join tree in a-d=b-d, a-d=a-b, b-c=a-b order and a-d=a-b and b-c=a-b will be merged. Final join tree is a-(bdc). With this, ab-d join will be executed prior to ab-c. But if join type of -c and -d is different, this is not valid. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HIVE-4249) current database is retained between sessions in hive server2
[ https://issues.apache.org/jira/browse/HIVE-4249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair resolved HIVE-4249. - Resolution: Duplicate Thanks Prasad for pointing that out! Marking as duplicate. current database is retained between sessions in hive server2 --- Key: HIVE-4249 URL: https://issues.apache.org/jira/browse/HIVE-4249 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair Assignee: Thejas M Nair Fix For: 0.11.0 current database is retained between sessions in hive server2. To reproduce - Run this several times - bin/beeline -e '!connect jdbc:hive2://localhost:1 scott tiger org.apache.hive.jdbc.HiveDriver' -e 'show tables;' -e ' use newdb;' -e ' show tables;' table ab is a table in the default database, newtab is a table in the newdb database. Expected result is
{code}
+---+
| tab_name |
+---+
| ab|
+---+
1 row selected (0.457 seconds)
No rows affected (0.039 seconds)
+---+
| tab_name |
+---+
| newtab|
+---+
{code}
But after running it several times, you see threads having newdb as the default database, i.e. the output of the above command becomes -
{code}
+---+
| tab_name |
+---+
| newtab|
+---+
1 row selected (0.518 seconds)
No rows affected (0.052 seconds)
+---+
| tab_name |
+---+
| newtab|
+---+
1 row selected (0.232 seconds)
{code}
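The symptom above is server-side state outliving a session. A toy sketch of that bug class (class and field names are invented for illustration, not HiveServer2's real code): keeping the current database in shared, server-wide state leaks it into later sessions, while per-session state does not:

```java
// Toy model of the "current database retained between sessions" bug class.
// All names here are hypothetical, for illustration only.
class SessionStateSketch {
    // Buggy design: one "current database" shared by every session on the server.
    static String sharedCurrentDb = "default";

    static class Session {
        // Correct design: the current database is scoped to the session.
        String currentDb = "default";

        void use(String db) {
            currentDb = db;       // visible only to this session
            sharedCurrentDb = db; // leaks to every later session
        }
    }
}
```

A new session built on the per-session field starts in "default" as expected; the shared field still holds whatever the previous session set, which is exactly the reported behavior.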
[jira] [Commented] (HIVE-4179) NonBlockingOpDeDup does not merge SEL operators correctly
[ https://issues.apache.org/jira/browse/HIVE-4179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616938#comment-13616938 ] Navis commented on HIVE-4179: - minor comments on phabricator NonBlockingOpDeDup does not merge SEL operators correctly - Key: HIVE-4179 URL: https://issues.apache.org/jira/browse/HIVE-4179 Project: Hive Issue Type: Bug Reporter: Gunther Hagleitner Assignee: Gunther Hagleitner Priority: Critical Fix For: 0.11.0 Attachments: HIVE-4179.1.patch, HIVE-4179.2.patch, HIVE-4179.3.patch The input columns list for SEL operations isn't merged properly in the optimization. The best way to see this is running union_remove_22.q with -Dhadoop.mr.rev=23. The plan shows lost UDFs and a broken lineage for one column. Note: union_remove tests do not run on hadoop 1 or 0.20. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HIVE-4171) Current database in metastore.Hive is not consistent with SessionState
[ https://issues.apache.org/jira/browse/HIVE-4171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-4171: Attachment: HIVE-4171.3.patch I think it would be better to store this information in HiveConf and remove the member from the Hive class. This would mean that there is only one source of truth for this information (instead of having it in both the Hive and SessionState classes). I can submit another patch with a fix for the TODO in the patch, plus unit tests, if you agree. HIVE-4171.3.patch (also in https://reviews.apache.org/r/10180/ ) Current database in metastore.Hive is not consistent with SessionState -- Key: HIVE-4171 URL: https://issues.apache.org/jira/browse/HIVE-4171 Project: Hive Issue Type: Bug Components: CLI Reporter: Navis Assignee: Navis Labels: HiveServer2 Attachments: HIVE-4171.3.patch, HIVE-4171.D9399.1.patch, HIVE-4171.D9399.2.patch metastore.Hive is a thread-local instance, which can have a different status from SessionState. Currently the only status in metastore.Hive is the database name in use.
[jira] [Updated] (HIVE-4235) CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists
[ https://issues.apache.org/jira/browse/HIVE-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kevin Wilfong updated HIVE-4235: Resolution: Fixed Fix Version/s: 0.11.0 Status: Resolved (was: Patch Available) Committed, thanks Tim. CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists Key: HIVE-4235 URL: https://issues.apache.org/jira/browse/HIVE-4235 Project: Hive Issue Type: Bug Components: JDBC, Query Processor, SQL Reporter: Gang Tim Liu Assignee: Gang Tim Liu Fix For: 0.11.0 Attachments: HIVE-4235.patch.1 CREATE TABLE IF NOT EXISTS uses an inefficient way to check if a table exists. It uses Hive.java's getTablesByPattern(...) to check if the table exists. This involves a regular expression and eventually a database join. Very inefficient. It can increase database lock time and hurt DB performance if a lot of such commands hit the database. The suggested approach is to use getTable(...) since we already know the table name.
[jira] [Commented] (HIVE-4235) CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists
[ https://issues.apache.org/jira/browse/HIVE-4235?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616964#comment-13616964 ] Gang Tim Liu commented on HIVE-4235: Kevin, thank you very much. Tim CREATE TABLE IF NOT EXISTS uses inefficient way to check if table exists Key: HIVE-4235 URL: https://issues.apache.org/jira/browse/HIVE-4235 Project: Hive Issue Type: Bug Components: JDBC, Query Processor, SQL Reporter: Gang Tim Liu Assignee: Gang Tim Liu Fix For: 0.11.0 Attachments: HIVE-4235.patch.1 CREATE TABLE IF NOT EXISTS uses an inefficient way to check if a table exists. It uses Hive.java's getTablesByPattern(...) to check if the table exists. This involves a regular expression and eventually a database join. Very inefficient. It can increase database lock time and hurt DB performance if a lot of such commands hit the database. The suggested approach is to use getTable(...) since we already know the table name.
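The cost difference the report describes can be seen with a toy metastore: a pattern-based check has to compile a regular expression and scan every candidate name (and, in the real metastore, drive a database join), while a lookup by exact name is a direct keyed fetch. A sketch under that assumption (plain Java maps standing in for the metastore; these are not Hive's real classes):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Toy metastore contrasting the two existence checks; names are illustrative only.
class ExistsCheckSketch {
    private final Map<String, String> tables = new HashMap<>();

    void createTable(String name) {
        tables.put(name, "metadata for " + name);
    }

    // Analogue of a getTablesByPattern-style check: regex match over every name (O(n) scan).
    boolean existsByPattern(String pattern) {
        Pattern p = Pattern.compile(pattern);
        for (String name : tables.keySet()) {
            if (p.matcher(name).matches()) {
                return true;
            }
        }
        return false;
    }

    // Analogue of a getTable-style check: direct keyed lookup, enough when the exact name is known.
    boolean existsByName(String name) {
        return tables.containsKey(name);
    }
}
```

When the caller already has the exact table name, as CREATE TABLE IF NOT EXISTS does, the keyed lookup gives the same answer without the scan.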
[jira] [Commented] (HIVE-4194) JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL
[ https://issues.apache.org/jira/browse/HIVE-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13616988#comment-13616988 ] Thejas M Nair commented on HIVE-4194: - Another non-binding +1 . (do non-binding +1's add up :) ) JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL -- Key: HIVE-4194 URL: https://issues.apache.org/jira/browse/HIVE-4194 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.11.0 Reporter: Richard Ding Assignee: Richard Ding Fix For: 0.11.0 Attachments: HIVE-4194.patch As per JDBC 3.0 Spec (section 9.2) If the Driver implementation understands the URL, it will return a Connection object; otherwise it returns null Currently HiveConnection constructor will throw IllegalArgumentException if url string doesn't start with jdbc:hive2. This exception should be caught by HiveDriver.connect and return null. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
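Section 9.2 of the JDBC spec quoted above translates directly into driver code: connect should decline quietly, returning null for a URL with a foreign prefix, instead of letting a constructor's exception escape. A minimal sketch of that contract (a stand-in class, not the actual HiveDriver; a plain Object stands in for a java.sql.Connection to keep the sketch self-contained):

```java
// Sketch of the JDBC contract at issue: decline foreign URLs with null, don't throw.
// Stand-in class; not HiveDriver itself.
class UrlCheckSketch {
    static final String URL_PREFIX = "jdbc:hive2://";

    static boolean acceptsURL(String url) {
        return url != null && url.startsWith(URL_PREFIX);
    }

    // Per JDBC spec section 9.2: return null for a URL this driver does not understand,
    // so DriverManager can offer the URL to the other registered drivers.
    static Object connect(String url) {
        if (!acceptsURL(url)) {
            return null; // decline quietly; no exception for a foreign URL
        }
        // Real connection setup elided; a placeholder stands in for a Connection.
        return new Object();
    }
}
```

DriverManager iterates over registered drivers and treats a null return as "not mine", which is why throwing from connect for a foreign URL breaks multi-driver setups.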
[jira] [Created] (HIVE-4252) hiveserver2 string representation of complex types is inconsistent with cli
Thejas M Nair created HIVE-4252: --- Summary: hiveserver2 string representation of complex types is inconsistent with cli Key: HIVE-4252 URL: https://issues.apache.org/jira/browse/HIVE-4252 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Thejas M Nair For example, it prints a struct as [null, null, null] instead of {"r":null,"s":null,"t":null}, and it prints a map as {k=v} instead of {"k":"v"}
[jira] [Updated] (HIVE-4252) hiveserver2 string representation of complex types is inconsistent with cli
[ https://issues.apache.org/jira/browse/HIVE-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-4252: Attachment: HIVE-4252.1.patch hiveserver2 string representation of complex types is inconsistent with cli Key: HIVE-4252 URL: https://issues.apache.org/jira/browse/HIVE-4252 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-4252.1.patch For example, it prints a struct as [null, null, null] instead of {"r":null,"s":null,"t":null}, and it prints a map as {k=v} instead of {"k":"v"}
[jira] [Updated] (HIVE-4252) hiveserver2 string representation of complex types is inconsistent with cli
[ https://issues.apache.org/jira/browse/HIVE-4252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thejas M Nair updated HIVE-4252: Status: Patch Available (was: Open) hiveserver2 string representation of complex types is inconsistent with cli Key: HIVE-4252 URL: https://issues.apache.org/jira/browse/HIVE-4252 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 0.11.0 Reporter: Thejas M Nair Assignee: Thejas M Nair Attachments: HIVE-4252.1.patch For example, it prints a struct as [null, null, null] instead of {"r":null,"s":null,"t":null}, and it prints a map as {k=v} instead of {"k":"v"}
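The map half of the inconsistency reduces to how the driver stringifies nested values: relying on Java's default Map.toString() yields {k=v}, while the CLI emits a JSON-style form with quoted keys and values. A sketch of the JSON-style rendering for a string-to-string map (illustrative only; the real fix has to cover structs, arrays, and arbitrarily nested types as well):

```java
import java.util.Map;

// Sketch of a CLI-style map rendering (string keys/values only, for illustration).
class ComplexTypeFormat {
    static String formatMap(Map<String, String> m) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : m.entrySet()) {
            if (!first) {
                sb.append(",");
            }
            first = false;
            // Quote keys and values JSON-style; null values are rendered bare.
            sb.append('"').append(e.getKey()).append("\":");
            sb.append(e.getValue() == null ? "null" : "\"" + e.getValue() + "\"");
        }
        return sb.append("}").toString();
    }
}
```

Contrast with Map.toString(), which produces the {k=v} form the report complains about.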
[jira] [Created] (HIVE-4253) use jdbc complex types for hive complex types
Thejas M Nair created HIVE-4253: --- Summary: use jdbc complex types for hive complex types Key: HIVE-4253 URL: https://issues.apache.org/jira/browse/HIVE-4253 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Thejas M Nair The hiveserver2 jdbc driver is converting the complex types into strings. It will be better to use suitable java objects as per jdbc spec.
[jira] [Commented] (HIVE-4109) Partition by column does not have to be in order by
[ https://issues.apache.org/jira/browse/HIVE-4109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13617045#comment-13617045 ] Harish Butani commented on HIVE-4109: - This should be fixed with HIVE-4197. Partition by column does not have to be in order by --- Key: HIVE-4109 URL: https://issues.apache.org/jira/browse/HIVE-4109 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Brock Noland Came up in the review of HIVE-4093. Ashutosh {noformat} I am not sure if this is illegal query. I tried following two queries in postgres, both of them succeeded. select p_mfgr, avg(p_retailprice) over(partition by p_mfgr, p_type order by p_mfgr) from part; select p_mfgr, avg(p_retailprice) over(partition by p_mfgr order by p_type,p_mfgr) from part; {noformat} Harish {noformat} The first one doesn't make sense, right? Order on a subset of the partition columns The second one: Can we do this with the Hive ReduceOp have the orderColumns be in a different order than the key columns? {noformat}
[jira] [Created] (HIVE-4254) Code cleanup : debug methods, having clause associated with Windowing
Harish Butani created HIVE-4254: --- Summary: Code cleanup : debug methods, having clause associated with Windowing Key: HIVE-4254 URL: https://issues.apache.org/jira/browse/HIVE-4254 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Harish Butani Assignee: Harish Butani - remove debug functions in SemanticAnalyzer - remove code dealing with having clause associated with Windowing
[jira] [Updated] (HIVE-4194) JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL
[ https://issues.apache.org/jira/browse/HIVE-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Carl Steinbach updated HIVE-4194: - Status: Open (was: Patch Available) We shouldn't let people instantiate malformed HiveConnection objects. Please make the HiveConnection constructor private and add static builder methods to HiveConnection (e.g. HiveConnection.newConnection(String url, Properties info)) that validate the input URL and return null if it's invalid. Please also relocate acceptsURL() to HiveConnection and make it private. Thanks. JDBC2: HiveDriver should not throw RuntimeException when passed an invalid URL -- Key: HIVE-4194 URL: https://issues.apache.org/jira/browse/HIVE-4194 Project: Hive Issue Type: Bug Components: HiveServer2, JDBC Affects Versions: 0.11.0 Reporter: Richard Ding Assignee: Richard Ding Fix For: 0.11.0 Attachments: HIVE-4194.patch As per JDBC 3.0 Spec (section 9.2) If the Driver implementation understands the URL, it will return a Connection object; otherwise it returns null Currently the HiveConnection constructor throws IllegalArgumentException if the URL string doesn't start with jdbc:hive2. HiveDriver.connect should catch this exception and return null.
[jira] [Commented] (HIVE-2264) Hive server is SHUTTING DOWN when invalid queries are being executed.
[ https://issues.apache.org/jira/browse/HIVE-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13617075#comment-13617075 ] Carl Steinbach commented on HIVE-2264: -- @Navis: we relaxed that rule. You can commit your own patches as long as you get a +1 from another committer. You're good to go. Hive server is SHUTTING DOWN when invalid queries are being executed. -- Key: HIVE-2264 URL: https://issues.apache.org/jira/browse/HIVE-2264 Project: Hive Issue Type: Bug Components: HiveServer2, Query Processor Affects Versions: 0.9.0 Environment: SuSE-Linux-11 Reporter: rohithsharma Assignee: Navis Priority: Blocker Fix For: 0.11.0 Attachments: HIVE-2264.1.patch.txt, HIVE-2264-2.patch, HIVE-2264.D9489.1.patch When an invalid query is being executed, the Hive server shuts down. {noformat} CREATE TABLE SAMPLETABLE(IP STRING , showtime BIGINT ) partitioned by (ds string,ipz int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\040' ALTER TABLE SAMPLETABLE add Partition(ds='sf') location '/user/hive/warehouse' Partition(ipz=100) location '/user/hive/warehouse' {noformat}
[jira] [Updated] (HIVE-4254) Code cleanup : debug methods, having clause associated with Windowing
[ https://issues.apache.org/jira/browse/HIVE-4254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phabricator updated HIVE-4254: -- Attachment: HIVE-4254.D9795.1.patch hbutani requested code review of HIVE-4254 [jira] Code cleanup : debug methods, having clause associated with Windowing. Reviewers: JIRA, ashutoshc cleanup code remove debug functions in SemanticAnalyzer remove code dealing with having clause associated with Windowing TEST PLAN EMPTY REVISION DETAIL https://reviews.facebook.net/D9795 AFFECTED FILES data/files/flights_tiny.txt data/files/part.rc data/files/part.seq ql/src/java/org/apache/hadoop/hive/ql/exec/PTFOperator.java ql/src/java/org/apache/hadoop/hive/ql/optimizer/ColumnPrunerProcFactory.java ql/src/java/org/apache/hadoop/hive/ql/parse/PTFTranslator.java ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java ql/src/java/org/apache/hadoop/hive/ql/parse/WindowingComponentizer.java ql/src/java/org/apache/hadoop/hive/ql/parse/WindowingSpec.java ql/src/java/org/apache/hadoop/hive/ql/plan/PTFDesc.java ql/src/java/org/apache/hadoop/hive/ql/plan/PTFDeserializer.java MANAGE HERALD RULES https://reviews.facebook.net/herald/view/differential/ WHY DID I GET THIS EMAIL? https://reviews.facebook.net/herald/transcript/23361/ To: JIRA, ashutoshc, hbutani Code cleanup : debug methods, having clause associated with Windowing - Key: HIVE-4254 URL: https://issues.apache.org/jira/browse/HIVE-4254 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Harish Butani Assignee: Harish Butani Attachments: HIVE-4254.D9795.1.patch - remove debug functions in SemanticAnalyzer - remove code dealing with having clause associated with Windowing -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HIVE-4255) update show_functions.q.out for functions added for windowing
Harish Butani created HIVE-4255: --- Summary: update show_functions.q.out for functions added for windowing Key: HIVE-4255 URL: https://issues.apache.org/jira/browse/HIVE-4255 Project: Hive Issue Type: Bug Components: PTF-Windowing Reporter: Harish Butani Assignee: Harish Butani
[jira] [Commented] (HIVE-4255) update show_functions.q.out for functions added for windowing
[ https://issues.apache.org/jira/browse/HIVE-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13617093#comment-13617093 ]

Harish Butani commented on HIVE-4255:
-------------------------------------
Patch is attached.

update show_functions.q.out for functions added for windowing
-------------------------------------------------------------
                Key: HIVE-4255
                URL: https://issues.apache.org/jira/browse/HIVE-4255
            Project: Hive
          Issue Type: Bug
          Components: PTF-Windowing
            Reporter: Harish Butani
            Assignee: Harish Butani
         Attachments: HIVE-4255.1.patch.txt
[jira] [Updated] (HIVE-4255) update show_functions.q.out for functions added for windowing
[ https://issues.apache.org/jira/browse/HIVE-4255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harish Butani updated HIVE-4255:
--------------------------------
    Attachment: HIVE-4255.1.patch.txt

update show_functions.q.out for functions added for windowing
-------------------------------------------------------------
                Key: HIVE-4255
                URL: https://issues.apache.org/jira/browse/HIVE-4255
            Project: Hive
          Issue Type: Bug
          Components: PTF-Windowing
            Reporter: Harish Butani
            Assignee: Harish Butani
         Attachments: HIVE-4255.1.patch.txt
[jira] [Updated] (HIVE-2264) Hive server is SHUTTING DOWN when invalid queries being executed.
[ https://issues.apache.org/jira/browse/HIVE-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Navis updated HIVE-2264:
------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed; this was my first commit. Thanks to all.

Hive server is SHUTTING DOWN when invalid queries being executed.
-----------------------------------------------------------------
                Key: HIVE-2264
                URL: https://issues.apache.org/jira/browse/HIVE-2264
            Project: Hive
          Issue Type: Bug
          Components: HiveServer2, Query Processor
    Affects Versions: 0.9.0
         Environment: SuSE-Linux-11
            Reporter: rohithsharma
            Assignee: Navis
            Priority: Blocker
             Fix For: 0.11.0
         Attachments: HIVE-2264.1.patch.txt, HIVE-2264-2.patch, HIVE-2264.D9489.1.patch

When an invalid query is executed, the Hive server shuts down. For example:
{noformat}
CREATE TABLE SAMPLETABLE(IP STRING , showtime BIGINT ) partitioned by (ds string,ipz int) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\040'
ALTER TABLE SAMPLETABLE add Partition(ds='sf') location '/user/hive/warehouse' Partition(ipz=100) location '/user/hive/warehouse'
{noformat}
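For context, the ALTER TABLE statement in the repro above is malformed: in Hive, every partition spec in an ADD PARTITION clause must name all of the table's partition columns (here both ds and ipz) together, rather than splitting them across separate PARTITION clauses. A sketch of the accepted form (not taken from the patch; the location path is illustrative) would be:

```sql
-- Valid HiveQL: the partition spec lists all partition columns at once
ALTER TABLE SAMPLETABLE ADD
  PARTITION (ds='sf', ipz=100) LOCATION '/user/hive/warehouse';
```

The bug being fixed is not the query itself but that HiveServer exits instead of returning a parse/semantic error for the invalid statement.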
[jira] [Commented] (HIVE-2264) Hive server is SHUTTING DOWN when invalid queries being executed.
[ https://issues.apache.org/jira/browse/HIVE-2264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13617110#comment-13617110 ]

Phabricator commented on HIVE-2264:
-----------------------------------
navis has abandoned the revision "HIVE-2264 [jira] Hive server is SHUTTING DOWN when invalid queries being executed.".

  Committed

REVISION DETAIL
  https://reviews.facebook.net/D9489

To: JIRA, navis