[jira] [Commented] (HIVE-17426) Execution framework in hive to run tasks in parallel other than MR Tasks
[ https://issues.apache.org/jira/browse/HIVE-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160785#comment-16160785 ] Hive QA commented on HIVE-17426:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12886342/HIVE-17426.5.patch

{color:green}SUCCESS:{color} +1 due to 2 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 11026 tests executed

*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
TestLocationQueries - did not produce a TEST-*.xml file (likely timed out) (batchId=218)
TestReplicationScenariosAcrossInstances - did not produce a TEST-*.xml file (likely timed out) (batchId=218)
TestSemanticAnalyzerHookLoading - did not produce a TEST-*.xml file (likely timed out) (batchId=218)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=100)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6762/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6762/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6762/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 8 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12886342 - PreCommit-HIVE-Build > Execution framework in hive to run tasks in parallel other than MR Tasks > > > Key: HIVE-17426 > URL: https://issues.apache.org/jira/browse/HIVE-17426 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: anishek >Assignee: anishek > Fix For: 3.0.0 > > Attachments: HIVE-17426.0.patch, HIVE-17426.1.patch, > HIVE-17426.2.patch, HIVE-17426.3.patch, HIVE-17426.4.patch, HIVE-17426.5.patch > > > the execution framework currently only runs MR Tasks in parallel when {{set > hive.exec.parallel=true}}. > Allow other types of tasks to run in parallel as well to support replication > scenarios in hive. TezTask / SparkTask will still not be allowed to run in > parallel. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
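The behaviour this issue describes can be illustrated with a minimal, self-contained Java sketch. The `Task` interface, method names, and task names below are hypothetical stand-ins, not Hive's actual `org.apache.hadoop.hive.ql.exec.Task` hierarchy: parallel-safe tasks are submitted to a thread pool, while task types excluded from parallelism (as TezTask/SparkTask are here) run serially on the caller's thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelLauncherSketch {
    // Hypothetical stand-in for Hive's Task hierarchy (names are ours, not Hive's).
    interface Task {
        String name();
        boolean parallelSafe();
    }

    static final AtomicInteger ran = new AtomicInteger();

    static Task task(String name, boolean parallelSafe) {
        return new Task() {
            public String name() { return name; }
            public boolean parallelSafe() { return parallelSafe; }
        };
    }

    // Parallel-safe tasks go to a pool; excluded types (TezTask/SparkTask in the
    // issue's terms) run serially on the caller's thread.
    static void launch(List<Task> tasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<?>> pending = new ArrayList<>();
        for (Task t : tasks) {
            if (t.parallelSafe()) {
                pending.add(pool.submit(() -> { ran.incrementAndGet(); }));
            } else {
                ran.incrementAndGet(); // serial execution on this thread
            }
        }
        for (Future<?> f : pending) {
            f.get(); // wait for all parallel tasks before returning
        }
        pool.shutdown();
    }

    public static void main(String[] args) throws Exception {
        launch(List.of(task("repl-load-1", true),
                       task("repl-load-2", true),
                       task("tez-task", false)));
        System.out.println(ran.get());
    }
}
```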
[jira] [Updated] (HIVE-17498) Does hive have mr-nativetask support refer to MAPREDUCE-2841
[ https://issues.apache.org/jira/browse/HIVE-17498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Feng Yuan updated HIVE-17498: - Description:

I tried to implement a HivePlatform extending org.apache.hadoop.mapred.nativetask.Platform:
{code}
/**
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.hadoop.mapred.nativetask;

import org.apache.hadoop.hive.ql.io.HiveKey;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.nativetask.serde.INativeSerializer;
import org.apache.log4j.Logger;

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

public class HivePlatform extends Platform {

  private static final Logger LOG = Logger.getLogger(HivePlatform.class);

  public HivePlatform() {
  }

  @Override
  public void init() throws IOException {
    registerKey("org.apache.hadoop.hive.ql.io.HiveKey", HiveKeySerializer.class);
    LOG.info("Hive platform inited");
  }

  @Override
  public String name() {
    return "Hive";
  }

  @Override
  public boolean support(String keyClassName, INativeSerializer serializer, JobConf job) {
    if (keyClassNames.contains(keyClassName) && serializer instanceof INativeComparable) {
      String nativeComparator = Constants.NATIVE_MAPOUT_KEY_COMPARATOR + "." + keyClassName;
      job.set(nativeComparator, "HivePlatform.HivePlatform::HiveKeyComparator");
      if (job.get(Constants.NATIVE_CLASS_LIBRARY_BUILDIN) == null) {
        job.set(Constants.NATIVE_CLASS_LIBRARY_BUILDIN, "HivePlatform=libnativetask.so");
      }
      return true;
    } else {
      return false;
    }
  }

  @Override
  public boolean define(Class comparatorClass) {
    return false;
  }

  public static class HiveKeySerializer implements INativeComparable, INativeSerializer {

    public HiveKeySerializer() throws ClassNotFoundException, SecurityException, NoSuchMethodException {
    }

    @Override
    public int getLength(HiveKey w) throws IOException {
      return 4 + w.getLength();
    }

    @Override
    public void serialize(HiveKey w, DataOutput out) throws IOException {
      w.write(out);
    }

    @Override
    public void deserialize(DataInput in, int length, HiveKey w) throws IOException {
      w.readFields(in);
    }
  }
}
{code}
and it throws:
{code}
Error: java.io.IOException: Initialization of all the collectors failed. Error in last collector was :Native output collector cannot be loaded;
	at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:415)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:442)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1700)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: Native output collector cannot be loaded;
	at org.apache.hadoop.mapred.nativetask.NativeMapOutputCollectorDelegator.init(NativeMapOutputCollectorDelegator.java:165)
	at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:402)
	... 7 more
Caused by: java.io.IOException: /PartitionBucket.h:56:pool is NULL, or comparator is not set
	/usr/local/hadoop-2.7.3-yarn/lib/native/libnativetask.so.1.0.0(_ZN10NativeTask15HadoopExceptionC2ERKSs+0x76) [0x7ffdcbba6436]
	/usr/local/hadoop-2.7.3-yarn/lib/native/libnativetask.so.1.0.0(_ZN10NativeTask18MapOutputCollector4initEjjPFiPKcjS2_jEPNS_14ICombineRunnerE+0x36a) [0x7ffdcbb9ad6a]
	/usr/local/hadoop-2.7.3-yarn/lib/native/libnativetask.so.1.0.0(_ZN10NativeTask18MapOutputCollector9configureEPNS_6ConfigE+0x24a) [0x7ffdcbb9b37a]
	/usr/local/hadoop-2.7.3-yarn/lib/native/libnativetask.so.1.0.0(_ZN10NativeTask23MCollectorOutputHandler9configureEPNS_6ConfigE+0x80) [0x7ffdcbb91b40]
	/usr/local/hadoop-2.7.3-yarn/lib/native/libnativetask.so.1.0.0(_ZN10NativeTask12BatchHandler7onSetupEPNS_6ConfigEPcjS3_j+0xe9) [0x7ffdcbb90b29]
	/usr/local/hadoop-2.7.3-yarn/lib/native/libnativetask.so.1.0.0(Java_org_apache_had
[jira] [Assigned] (HIVE-13923) don't support DATETIME type
[ https://issues.apache.org/jira/browse/HIVE-13923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko reassigned HIVE-13923: - Assignee: Dmitry Tolpeko > don't support DATETIME type > --- > > Key: HIVE-13923 > URL: https://issues.apache.org/jira/browse/HIVE-13923 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Reporter: longgeligelong >Assignee: Dmitry Tolpeko > Attachments: HIVE-13923.patch > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-16886) HMS log notifications may have duplicated event IDs if multiple HMS are running concurrently
[ https://issues.apache.org/jira/browse/HIVE-16886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] anishek updated HIVE-16886: --- Attachment: HIVE-16886.8.patch rebased from master [~daijy] > HMS log notifications may have duplicated event IDs if multiple HMS are > running concurrently > > > Key: HIVE-16886 > URL: https://issues.apache.org/jira/browse/HIVE-16886 > Project: Hive > Issue Type: Bug > Components: Hive, Metastore >Reporter: Sergio Peña >Assignee: anishek > Attachments: datastore-identity-holes.diff, HIVE-16886.1.patch, > HIVE-16886.2.patch, HIVE-16886.3.patch, HIVE-16886.4.patch, > HIVE-16886.5.patch, HIVE-16886.6.patch, HIVE-16886.7.patch, HIVE-16886.8.patch > > > When running multiple Hive Metastore servers and DB notifications are > enabled, I could see that notifications can be persisted with a duplicated > event ID. > This does not happen when running multiple threads in a single HMS node due > to the locking acquired on the DbNotificationsLog class, but multiple HMS > could cause conflicts. > The issue is in the ObjectStore#addNotificationEvent() method. The event ID > fetched from the datastore is used for the new notification, incremented in > the server itself, then persisted or updated back to the datastore. If 2 > servers read the same ID, then these 2 servers write a new notification with > the same ID. > The event ID is not unique nor a primary key. 
> Here's a test case using the TestObjectStore class that confirms this issue:
> {noformat}
> @Test
> public void testConcurrentAddNotifications() throws ExecutionException, InterruptedException {
>   final int NUM_THREADS = 2;
>   CountDownLatch countIn = new CountDownLatch(NUM_THREADS);
>   CountDownLatch countOut = new CountDownLatch(1);
>   HiveConf conf = new HiveConf();
>   conf.setVar(HiveConf.ConfVars.METASTORE_EXPRESSION_PROXY_CLASS,
>       MockPartitionExpressionProxy.class.getName());
>   ExecutorService executorService = Executors.newFixedThreadPool(NUM_THREADS);
>   FutureTask tasks[] = new FutureTask[NUM_THREADS];
>   for (int i = 0; i < NUM_THREADS; i++) {
>     final int n = i;
>     tasks[i] = new FutureTask(new Callable() {
>       @Override
>       public Void call() throws Exception {
>         ObjectStore store = new ObjectStore();
>         store.setConf(conf);
>         NotificationEvent dbEvent = new NotificationEvent(0, 0,
>             EventMessage.EventType.CREATE_DATABASE.toString(), "CREATE DATABASE DB" + n);
>         System.out.println("ADDING NOTIFICATION");
>         countIn.countDown();
>         countOut.await();
>         store.addNotificationEvent(dbEvent);
>         System.out.println("FINISH NOTIFICATION");
>         return null;
>       }
>     });
>     executorService.execute(tasks[i]);
>   }
>   countIn.await();
>   countOut.countDown();
>   for (int i = 0; i < NUM_THREADS; ++i) {
>     tasks[i].get();
>   }
>   NotificationEventResponse eventResponse =
>       objectStore.getNextNotification(new NotificationEventRequest());
>   Assert.assertEquals(2, eventResponse.getEventsSize());
>   Assert.assertEquals(1, eventResponse.getEvents().get(0).getEventId());
>   // This fails because the next notification has an event ID = 1
>   Assert.assertEquals(2, eventResponse.getEvents().get(1).getEventId());
> }
> {noformat}
> The last assertion fails: the second event's ID is 1 instead of the expected 2. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
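The race described above (read the current ID, increment locally, write back) is independent of Hive. A minimal Java sketch with hypothetical names shows the failure mode; a shared lock only helps within one process, which is why two separate HMS instances can still collide (cross-process remedies such as a database-level atomic increment are one option, not necessarily what the attached patches do):

```java
import java.util.concurrent.CountDownLatch;

public class DuplicateIdSketch {
    // Stand-in for the persisted event-ID sequence the issue describes.
    // Deliberately unsynchronized: each caller reads, increments locally, writes back.
    static volatile long storedId = 0;

    static long addEvent() throws InterruptedException {
        long id = storedId + 1; // read + local increment
        Thread.sleep(50);       // widen the race window for the demo
        storedId = id;          // write back
        return id;              // the "event ID" this caller persists
    }

    public static void main(String[] args) throws Exception {
        final long[] ids = new long[2];
        CountDownLatch start = new CountDownLatch(1);
        Thread[] servers = new Thread[2]; // two "HMS instances"
        for (int i = 0; i < 2; i++) {
            final int n = i;
            servers[i] = new Thread(() -> {
                try {
                    start.await();
                    ids[n] = addEvent();
                } catch (InterruptedException e) {
                    throw new RuntimeException(e);
                }
            });
            servers[i].start();
        }
        start.countDown();
        for (Thread t : servers) {
            t.join();
        }
        // Both threads typically read storedId == 0 and both return event ID 1.
        System.out.println(ids[0] + " " + ids[1]);
    }
}
```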
[jira] [Updated] (HIVE-17426) Execution framework in hive to run tasks in parallel other than MR Tasks
[ https://issues.apache.org/jira/browse/HIVE-17426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] anishek updated HIVE-17426: --- Attachment: HIVE-17426.5.patch removed graphstream and correct some comments > Execution framework in hive to run tasks in parallel other than MR Tasks > > > Key: HIVE-17426 > URL: https://issues.apache.org/jira/browse/HIVE-17426 > Project: Hive > Issue Type: Improvement > Components: HiveServer2 >Affects Versions: 3.0.0 >Reporter: anishek >Assignee: anishek > Fix For: 3.0.0 > > Attachments: HIVE-17426.0.patch, HIVE-17426.1.patch, > HIVE-17426.2.patch, HIVE-17426.3.patch, HIVE-17426.4.patch, HIVE-17426.5.patch > > > the execution framework currently only runs MR Tasks in parallel when {{set > hive.exec.parallel=true}}. > Allow other types of tasks to run in parallel as well to support replication > scenarios in hive. TezTask / SparkTask will still not be allowed to run in > parallel. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter
[ https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160626#comment-16160626 ] Ferdinand Xu commented on HIVE-17261: -
[~junjie], I took another look. One more minor comment:
In ParquetRecordReaderBase
* L69: the split variable is not needed; you can just return the new ParquetInputSplit at the end.
> Hive use deprecated ParquetInputSplit constructor which blocked parquet > dictionary filter > - > > Key: HIVE-17261 > URL: https://issues.apache.org/jira/browse/HIVE-17261 > Project: Hive > Issue Type: Improvement > Components: Database/Schema >Affects Versions: 2.2.0 >Reporter: Junjie Chen >Assignee: Junjie Chen > Attachments: HIVE-17261.2.patch, HIVE-17261.3.patch, > HIVE-17261.4.patch, HIVE-17261.5.patch, HIVE-17261.6.patch, HIVE-17261.diff, > HIVE-17261.patch > > > Hive use deprecated ParquetInputSplit in > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128] > Please see interface definition in > [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80] > Old interface set rowgroupoffset values which will lead to skip dictionary > filter in parquet. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
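The issue's description can be illustrated with a simplified, hypothetical sketch (this is not parquet-mr's actual code; the `RowGroup` type and the constant predicate are ours): a split that carries explicit rowGroupOffsets pins its row groups, so statistics-based pruning never runs; only an offset-free split lets the reader apply filters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitFilterSketch {
    // Hypothetical row group carrying a file offset and min/max statistics.
    static class RowGroup {
        final long offset;
        final int min;
        final int max;
        RowGroup(long offset, int min, int max) {
            this.offset = offset; this.min = min; this.max = max;
        }
    }

    // Simplified mirror of the described behaviour: explicit rowGroupOffsets
    // (the old, deprecated constructor path) pin the groups to read, skipping
    // pruning; a null offsets array (new path) lets the predicate
    // "value 42 in [min, max]" prune groups.
    static List<RowGroup> selectRowGroups(long[] rowGroupOffsets, List<RowGroup> groups) {
        List<RowGroup> selected = new ArrayList<>();
        for (RowGroup g : groups) {
            if (rowGroupOffsets != null) {
                if (Arrays.stream(rowGroupOffsets).anyMatch(o -> o == g.offset)) {
                    selected.add(g); // taken verbatim, even if statistics rule it out
                }
            } else if (g.min <= 42 && 42 <= g.max) {
                selected.add(g);     // kept only when the predicate can match
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<RowGroup> groups = List.of(new RowGroup(0, 0, 10), new RowGroup(100, 40, 50));
        System.out.println(selectRowGroups(new long[]{0, 100}, groups).size()); // 2: no pruning
        System.out.println(selectRowGroups(null, groups).size());               // 1: pruned
    }
}
```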
[jira] [Commented] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter
[ https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160599#comment-16160599 ] Hive QA commented on HIVE-17261:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12886325/HIVE-17261.6.patch

{color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11033 tests executed

*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[parquet_ppd_char] (batchId=9)
org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=143)
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=100)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6761/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6761/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6761/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12886325 - PreCommit-HIVE-Build > Hive use deprecated ParquetInputSplit constructor which blocked parquet > dictionary filter > - > > Key: HIVE-17261 > URL: https://issues.apache.org/jira/browse/HIVE-17261 > Project: Hive > Issue Type: Improvement > Components: Database/Schema >Affects Versions: 2.2.0 >Reporter: Junjie Chen >Assignee: Junjie Chen > Attachments: HIVE-17261.2.patch, HIVE-17261.3.patch, > HIVE-17261.4.patch, HIVE-17261.5.patch, HIVE-17261.6.patch, HIVE-17261.diff, > HIVE-17261.patch > > > Hive use deprecated ParquetInputSplit in > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128] > Please see interface definition in > [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80] > Old interface set rowgroupoffset values which will lead to skip dictionary > filter in parquet. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-14836) Implement predicate pushing down in Vectorized Page reader
[ https://issues.apache.org/jira/browse/HIVE-14836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdinand Xu reassigned HIVE-14836: --- Assignee: Ferdinand Xu > Implement predicate pushing down in Vectorized Page reader > -- > > Key: HIVE-14836 > URL: https://issues.apache.org/jira/browse/HIVE-14836 > Project: Hive > Issue Type: Sub-task >Reporter: Ferdinand Xu >Assignee: Ferdinand Xu > > Currently we filter blocks using Predict pushing down. We should support it > in page reader as well to improve its efficiency. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter
[ https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160552#comment-16160552 ] Ferdinand Xu edited comment on HIVE-17261 at 9/11/17 2:09 AM: --
Thanks [~junjie] for the patch. One comment is not addressed:
In ParquetRecordReaderBase.java
* Please remove the @Deprecated annotation in L65, since we are no longer using the deprecated constructor.
A few more comments:
In ParquetRecordReaderBase.java
* Remove the unnecessary return in L131.
In TestParquetRowGroupFilter.java
* Since the filter takes effect automatically within the Parquet reader, we should add test cases to verify it at the reader level; the current tests only cover RowGroupFilter.filterRowGroups.
Could you create a Review Board request next time for review? Thank you!

was (Author: ferd):
Thanks Junjie Chen for the patch. One comment is not addressed:
In ParquetRecordReaderBase.java
* Please remove the @Deprecated annotation in L65, since we are no longer using the deprecated constructor.
A few more comments:
In ParquetRecordReaderBase.java
* Remove the unnecessary return in L131.
In TestParquetRowGroupFilter.java
* Since the filter takes effect automatically within the Parquet reader, we should add test cases to verify it at the reader level; the current tests only cover RowGroupFilter.filterRowGroups.
Could you create a Review Board request next time for review? Thank you!
> Hive use deprecated ParquetInputSplit constructor which blocked parquet > dictionary filter > - > > Key: HIVE-17261 > URL: https://issues.apache.org/jira/browse/HIVE-17261 > Project: Hive > Issue Type: Improvement > Components: Database/Schema >Affects Versions: 2.2.0 >Reporter: Junjie Chen >Assignee: Junjie Chen > Attachments: HIVE-17261.2.patch, HIVE-17261.3.patch, > HIVE-17261.4.patch, HIVE-17261.5.patch, HIVE-17261.6.patch, HIVE-17261.diff, > HIVE-17261.patch > > > Hive use deprecated ParquetInputSplit in > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128] > Please see interface definition in > [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80] > Old interface set rowgroupoffset values which will lead to skip dictionary > filter in parquet. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter
[ https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160552#comment-16160552 ] Ferdinand Xu commented on HIVE-17261: -
Thanks Junjie Chen for the patch. One comment is not addressed:
In ParquetRecordReaderBase.java
* Please remove the @Deprecated annotation in L65, since we are no longer using the deprecated constructor.
A few more comments:
In ParquetRecordReaderBase.java
* Remove the unnecessary return in L131.
In TestParquetRowGroupFilter.java
* Since the filter takes effect automatically within the Parquet reader, we should add test cases to verify it at the reader level; the current tests only cover RowGroupFilter.filterRowGroups.
Could you create a Review Board request next time for review? Thank you!
> Hive use deprecated ParquetInputSplit constructor which blocked parquet > dictionary filter > - > > Key: HIVE-17261 > URL: https://issues.apache.org/jira/browse/HIVE-17261 > Project: Hive > Issue Type: Improvement > Components: Database/Schema >Affects Versions: 2.2.0 >Reporter: Junjie Chen >Assignee: Junjie Chen > Attachments: HIVE-17261.2.patch, HIVE-17261.3.patch, > HIVE-17261.4.patch, HIVE-17261.5.patch, HIVE-17261.6.patch, HIVE-17261.diff, > HIVE-17261.patch > > > Hive use deprecated ParquetInputSplit in > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128] > Please see interface definition in > [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80] > Old interface set rowgroupoffset values which will lead to skip dictionary > filter in parquet. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17403) Fail concatenation for unmanaged and transactional tables
[ https://issues.apache.org/jira/browse/HIVE-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160544#comment-16160544 ] Hive QA commented on HIVE-17403:

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12886324/HIVE-17403.2.patch

{color:green}SUCCESS:{color} +1 due to 3 test(s) being added or modified.
{color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11036 tests executed

*Failed tests:*
{noformat}
TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61)
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[orc_merge13] (batchId=81)
org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234)
org.apache.hadoop.hive.cli.TestSparkNegativeCliDriver.testCliDriver[spark_stage_max_tasks] (batchId=241)
org.apache.hadoop.hive.llap.security.TestLlapSignerImpl.testSigning (batchId=290)
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6760/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6760/console
Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6760/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 7 tests failed
{noformat}

This message is automatically generated.
ATTACHMENT ID: 12886324 - PreCommit-HIVE-Build > Fail concatenation for unmanaged and transactional tables > - > > Key: HIVE-17403 > URL: https://issues.apache.org/jira/browse/HIVE-17403 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 3.0.0, 2.4.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Blocker > Attachments: HIVE-17403.1.patch, HIVE-17403.2.patch, > HIVE-17403.2.patch > > > ALTER TABLE .. CONCATENATE should fail if the table is not managed by hive. > For unmanaged tables, file names can be anything. Hive has some assumptions > about file names which can result in data loss for unmanaged tables. > Example of this is a table/partition having 2 different files files > (part-m-0__1417075294718 and part-m-00018__1417075294718). Although both > are completely different files, hive thinks these are files generated by > separate instances of same task (because of failure or speculative > execution). Hive will end up removing this file > {code} > 2017-08-28T18:19:29,516 WARN [b27f10d5-d957-4695-ab2a-1453401793df main]: > exec.Utilities (:()) - Duplicate taskid file removed: > file:/Users/table/part=20141120/.hive-staging_hive_2017-08-28_18-19-27_210_3381701454205724533-1/_tmp.-ext-1/part-m-00018__1417075294718 > with length 958510. Existing file: > file:/Users/table/part=20141120/.hive-staging_hive_2017-08-28_18-19-27_210_3381701454205724533-1/_tmp.-ext-1/part-m-0__1417075294718 > with length 1123116 > {code} > DDL should restrict concatenation for unmanaged tables. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
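The duplicate-removal policy behind the quoted warning can be sketched as follows. The `taskId` parser below is hypothetical and Hive's real rules (in exec.Utilities) are more involved, but the policy is the same shape: files that parse to the same task ID are treated as retries of one task and all but the largest are deleted, which is exactly what loses data when externally generated file names happen to share a parsed ID.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DuplicateTaskIdSketch {
    // Hypothetical parser: everything before the first '_' is the "task id".
    static String taskId(String fileName) {
        int i = fileName.indexOf('_');
        return i < 0 ? fileName : fileName.substring(0, i);
    }

    // Keep the largest file per task id and report the rest as duplicates to
    // delete -- safe for MR attempt files, lossy for arbitrary external names.
    static List<String> findDuplicates(Map<String, Long> fileSizes) {
        Map<String, String> keep = new HashMap<>();
        for (Map.Entry<String, Long> e : fileSizes.entrySet()) {
            String id = taskId(e.getKey());
            String current = keep.get(id);
            if (current == null || fileSizes.get(current) < e.getValue()) {
                keep.put(id, e.getKey()); // this file is now the "winning attempt"
            }
        }
        List<String> duplicates = new ArrayList<>(fileSizes.keySet());
        duplicates.removeAll(keep.values());
        return duplicates;
    }

    public static void main(String[] args) {
        // Two *different* external files whose names share a parsed task id:
        Map<String, Long> files = new LinkedHashMap<>();
        files.put("part-m-00018_1417075294718", 958510L);
        files.put("part-m-00018_1417075299999", 1123116L);
        System.out.println(findDuplicates(files)); // the smaller file would be deleted
    }
}
```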
[jira] [Updated] (HIVE-17261) Hive use deprecated ParquetInputSplit constructor which blocked parquet dictionary filter
[ https://issues.apache.org/jira/browse/HIVE-17261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Junjie Chen updated HIVE-17261: --- Attachment: HIVE-17261.6.patch > Hive use deprecated ParquetInputSplit constructor which blocked parquet > dictionary filter > - > > Key: HIVE-17261 > URL: https://issues.apache.org/jira/browse/HIVE-17261 > Project: Hive > Issue Type: Improvement > Components: Database/Schema >Affects Versions: 2.2.0 >Reporter: Junjie Chen >Assignee: Junjie Chen > Attachments: HIVE-17261.2.patch, HIVE-17261.3.patch, > HIVE-17261.4.patch, HIVE-17261.5.patch, HIVE-17261.6.patch, HIVE-17261.diff, > HIVE-17261.patch > > > Hive use deprecated ParquetInputSplit in > [https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/io/parquet/ParquetRecordReaderBase.java#L128] > Please see interface definition in > [https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetInputSplit.java#L80] > Old interface set rowgroupoffset values which will lead to skip dictionary > filter in parquet. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17403) Fail concatenation for unmanaged and transactional tables
[ https://issues.apache.org/jira/browse/HIVE-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-17403: - Attachment: HIVE-17403.2.patch Rebased patch > Fail concatenation for unmanaged and transactional tables > - > > Key: HIVE-17403 > URL: https://issues.apache.org/jira/browse/HIVE-17403 > Project: Hive > Issue Type: Bug >Affects Versions: 1.3.0, 3.0.0, 2.4.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Blocker > Attachments: HIVE-17403.1.patch, HIVE-17403.2.patch, > HIVE-17403.2.patch > > > ALTER TABLE .. CONCATENATE should fail if the table is not managed by hive. > For unmanaged tables, file names can be anything. Hive has some assumptions > about file names which can result in data loss for unmanaged tables. > Example of this is a table/partition having 2 different files files > (part-m-0__1417075294718 and part-m-00018__1417075294718). Although both > are completely different files, hive thinks these are files generated by > separate instances of same task (because of failure or speculative > execution). Hive will end up removing this file > {code} > 2017-08-28T18:19:29,516 WARN [b27f10d5-d957-4695-ab2a-1453401793df main]: > exec.Utilities (:()) - Duplicate taskid file removed: > file:/Users/table/part=20141120/.hive-staging_hive_2017-08-28_18-19-27_210_3381701454205724533-1/_tmp.-ext-1/part-m-00018__1417075294718 > with length 958510. Existing file: > file:/Users/table/part=20141120/.hive-staging_hive_2017-08-28_18-19-27_210_3381701454205724533-1/_tmp.-ext-1/part-m-0__1417075294718 > with length 1123116 > {code} > DDL should restrict concatenation for unmanaged tables. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17366) Constraint replication in bootstrap
[ https://issues.apache.org/jira/browse/HIVE-17366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-17366: -- Resolution: Fixed Hadoop Flags: Reviewed Fix Version/s: 3.0.0 Status: Resolved (was: Patch Available) Patch pushed to master. For the issue Sankar find during review, created HIVE-17497 for it. > Constraint replication in bootstrap > --- > > Key: HIVE-17366 > URL: https://issues.apache.org/jira/browse/HIVE-17366 > Project: Hive > Issue Type: New Feature > Components: repl >Reporter: Daniel Dai >Assignee: Daniel Dai > Fix For: 3.0.0 > > Attachments: HIVE-17366.1.patch, HIVE-17366.2.patch, > HIVE-17366.3.patch, HIVE-17366.4.patch > > > Incremental constraint replication is tracked in HIVE-15705. This is to track > the bootstrap replication. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-17497) Constraint import may fail during incremental replication
[ https://issues.apache.org/jira/browse/HIVE-17497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai reassigned HIVE-17497: - > Constraint import may fail during incremental replication > - > > Key: HIVE-17497 > URL: https://issues.apache.org/jira/browse/HIVE-17497 > Project: Hive > Issue Type: Bug > Components: repl >Reporter: Daniel Dai >Assignee: Daniel Dai > > During bootstrap repl dump, we may export a constraint twice, in both the > bootstrap dump and the incremental dump. Consider the following sequence: > 1. Get repl_id, dump table > 2. During dump, constraint is added > 3. This constraint will be in both bootstrap dump and incremental dump > 4. incremental repl_id will be newer, so the constraint will be loaded during > incremental replication > 5. since the constraint is already present from bootstrap replication, we will > get an exception -- This message was sent by Atlassian JIRA (v6.4.14#64029)
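One way a fix could make the failing step 5 harmless is to make constraint load idempotent: skip the incremental event when bootstrap already created a constraint with the same name, instead of raising an exception. A minimal sketch of that idea follows; the class and method names are illustrative, not the actual repl code:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of idempotent constraint load during replication.
public class ConstraintLoadSketch {
    // Stand-in for the metastore's view of existing constraints.
    private final Set<String> existing = new HashSet<>();

    // Returns true if the constraint was created, false if the event was a
    // no-op because the constraint already exists (e.g. from bootstrap).
    boolean load(String constraintName) {
        return existing.add(constraintName);
    }

    public static void main(String[] args) {
        ConstraintLoadSketch repl = new ConstraintLoadSketch();
        System.out.println(repl.load("pk_t1")); // true  (bootstrap creates it)
        System.out.println(repl.load("pk_t1")); // false (incremental replay skipped)
    }
}
```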
[jira] [Commented] (HIVE-11741) Add a new hook to run before query parse/compile
[ https://issues.apache.org/jira/browse/HIVE-11741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160506#comment-16160506 ] Hive QA commented on HIVE-11741: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12781292/HIVE-11741.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6759/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6759/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6759/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2017-09-10 23:13:02.151 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-6759/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2017-09-10 23:13:02.154 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 5df1540 HIVE-17480: repl dump sub dir should use UUID instead of timestamp (Tao Li, reviewed by Daniel Dai) + git clean -f -d Removing hplsql/src/test/queries/local/comparison2.sql Removing hplsql/src/test/results/local/comparison2.out.txt + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 5df1540 HIVE-17480: repl dump sub dir should use UUID instead of timestamp (Tao Li, reviewed by Daniel Dai) + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2017-09-10 23:13:02.931 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch Going to apply patch with: patch -p0 patching file common/src/java/org/apache/hadoop/hive/conf/HiveConf.java Hunk #1 succeeded at 2303 (offset 490 lines). patching file ql/src/java/org/apache/hadoop/hive/ql/Driver.java Hunk #1 succeeded at 72 with fuzz 1 (offset 5 lines). Hunk #2 succeeded at 440 (offset 64 lines). 
patching file ql/src/java/org/apache/hadoop/hive/ql/hooks/PreParseHook.java patching file ql/src/test/org/apache/hadoop/hive/ql/hooks/TestPreParseHook.java + [[ maven == \m\a\v\e\n ]] + rm -rf /data/hiveptest/working/maven/org/apache/hive + mvn -B clean install -DskipTests -T 4 -q -Dmaven.repo.local=/data/hiveptest/working/maven DataNucleus Enhancer (version 4.1.17) for API "JDO" DataNucleus Enhancer : Classpath >> /usr/share/maven/boot/plexus-classworlds-2.x.jar ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MDatabase ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MFieldSchema ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MType ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MTable ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MConstraint ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MSerDeInfo ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MOrder ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MColumnDescriptor ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MStringList ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MStorageDescriptor ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MPartition ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MIndex ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MRole ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MRoleMap ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MGlobalPrivilege ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MDBPrivilege ENHANCED (Persistable) : org.apache.hadoop.hive.metastore.model.MTablePrivilege ENHANCED (Persi
[jira] [Commented] (HIVE-17032) HPL/SQL Comparisons are only supported with strings and integers
[ https://issues.apache.org/jira/browse/HIVE-17032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160504#comment-16160504 ] Hive QA commented on HIVE-17032: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12886319/HIVE-17032.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 7 failed/errored test(s), 11034 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[ppd_windowing2] (batchId=10) org.apache.hadoop.hive.cli.TestEncryptedHDFSCliDriver.testCliDriver[encryption_insert_partition_dynamic] (batchId=167) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[unionDistinct_1] (batchId=143) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6758/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6758/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6758/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 7 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12886319 - PreCommit-HIVE-Build > HPL/SQL Comparisons are only supported with strings and integers > > > Key: HIVE-17032 > URL: https://issues.apache.org/jira/browse/HIVE-17032 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Reporter: Carter Shanklin >Assignee: Dmitry Tolpeko >Priority: Critical > Attachments: HIVE-17032.1.patch > > > This bug is part of a series of issues and surprising behavior I encountered > writing a reporting script that would aggregate values and give rows > different classifications based on the aggregate. Addressing some or all > of these issues would make HPL/SQL more accessible to newcomers. > In Var.java: > {code} > public int compareTo(Var v) { > if (this == v) { > return 0; > } > else if (v == null) { > return -1; > } > else if (type == Type.BIGINT && v.type == Type.BIGINT) { > return ((Long)value).compareTo((Long)v.value); > } > else if (type == Type.STRING && v.type == Type.STRING) { > return ((String)value).compareTo((String)v.value); > } > return -1; > } > {code} > It's surprising that comparisons with doubles and decimals (for example) > don't work as expected. > Version = 3.0.0-SNAPSHOT r71f52d8ad512904b3f2c4f04fe39a33f2834f1f2 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
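A natural direction for a fix, sketched below under the assumption that numeric types are promoted to `BigDecimal` before comparing (this is a sketch, not the attached HIVE-17032.1.patch), is to route any numeric pair through a common type instead of falling through to -1:

```java
import java.math.BigDecimal;

// Hypothetical sketch: a compareTo that handles mixed numeric types by
// promoting them to BigDecimal. Type names mirror the Var.java snippet
// above but the class itself is illustrative.
public class VarCompareSketch {
    enum Type { BIGINT, DOUBLE, DECIMAL, STRING }

    static int compare(Type t1, Object v1, Type t2, Object v2) {
        if (t1 == Type.STRING && t2 == Type.STRING) {
            return ((String) v1).compareTo((String) v2);
        }
        BigDecimal d1 = toDecimal(t1, v1);
        BigDecimal d2 = toDecimal(t2, v2);
        if (d1 != null && d2 != null) {
            return d1.compareTo(d2); // compares by value, ignoring scale
        }
        return -1; // incomparable types, as in the original snippet
    }

    static BigDecimal toDecimal(Type t, Object v) {
        switch (t) {
            case BIGINT:  return BigDecimal.valueOf((Long) v);
            case DOUBLE:  return BigDecimal.valueOf((Double) v);
            case DECIMAL: return (BigDecimal) v;
            default:      return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(compare(Type.BIGINT, 2L, Type.DOUBLE, 2.5));  // -1
        System.out.println(compare(Type.DECIMAL, new BigDecimal("3.0"),
                                   Type.BIGINT, 3L));                    // 0
    }
}
```

Using `compareTo` rather than `equals` on `BigDecimal` matters here: `equals` distinguishes 3.0 from 3 by scale, while `compareTo` treats them as equal.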
[jira] [Commented] (HIVE-17031) HPL/SQL Addition/Subtraction only supported on integers, datetimes and intervals
[ https://issues.apache.org/jira/browse/HIVE-17031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160488#comment-16160488 ] Hive QA commented on HIVE-17031: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12886317/HIVE-17031.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 11034 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6757/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6757/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6757/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 4 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12886317 - PreCommit-HIVE-Build > HPL/SQL Addition/Subtraction only supported on integers, datetimes and > intervals > > > Key: HIVE-17031 > URL: https://issues.apache.org/jira/browse/HIVE-17031 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Reporter: Carter Shanklin >Assignee: Dmitry Tolpeko >Priority: Critical > Attachments: HIVE-17031.1.patch > > > This bug is part of a series of issues and surprising behavior I encountered > writing a reporting script that would aggregate values and give rows > different classifications based on an the aggregate. Addressing some or all > of these issues would make HPL/SQL more accessible to newcomers. > In Expression.java: > {code} > public void operatorSub(HplsqlParser.ExprContext ctx) { > Var v1 = evalPop(ctx.expr(0)); > Var v2 = evalPop(ctx.expr(1)); > if (v1.value == null || v2.value == null) { > evalNull(); > } > else if (v1.type == Type.BIGINT && v2.type == Type.BIGINT) { > exec.stackPush(new Var((Long)v1.value - (Long)v2.value)); > } > else if (v1.type == Type.DATE && v2.type == Type.BIGINT) { > exec.stackPush(changeDateByInt((Date)v1.value, (Long)v2.value, false > /*subtract*/)); > } > else if (v1.type == Type.DATE && v2.type == Type.INTERVAL) { > exec.stackPush(new Var(((Interval)v2.value).dateChange((Date)v1.value, > false /*subtract*/))); > } > else if (v1.type == Type.TIMESTAMP && v2.type == Type.INTERVAL) { > exec.stackPush(new > Var(((Interval)v2.value).timestampChange((Timestamp)v1.value, false > /*subtract*/), v1.scale)); > } > else { > evalNull(); > } > } > {code} > So decimals and floating points are not considered. To be fair the docs don't > mention this as supported, but it is surprising. > Need: Test case for comparisons and equality, including nulls > Version = 3.0.0-SNAPSHOT r71f52d8ad512904b3f2c4f04fe39a33f2834f1f2 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
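The numeric gap in `operatorSub` could be closed the usual way arithmetic is widened: promote mixed BIGINT/DOUBLE/DECIMAL operands to a common type before subtracting. A hedged sketch of just that numeric branch (not the attached HIVE-17031.1.patch; names are illustrative):

```java
import java.math.BigDecimal;

// Hypothetical sketch of the numeric branch operatorSub could gain:
// promote any mix of long, double and decimal operands to BigDecimal.
public class NumericSubSketch {
    static BigDecimal subtract(Number a, Number b) {
        return toDecimal(a).subtract(toDecimal(b));
    }

    static BigDecimal toDecimal(Number n) {
        if (n instanceof BigDecimal) return (BigDecimal) n;
        if (n instanceof Double)     return BigDecimal.valueOf((Double) n);
        return BigDecimal.valueOf(n.longValue()); // BIGINT case
    }

    public static void main(String[] args) {
        System.out.println(subtract(10L, 2.5));                  // 7.5
        System.out.println(subtract(new BigDecimal("1.1"), 1L)); // 0.1
    }
}
```

Going through `BigDecimal` also sidesteps binary-floating-point surprises such as `1.1 - 1.0` not printing `0.1` when done in `double`.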
[jira] [Updated] (HIVE-12676) [hive+impala] Alter table Rename to + Set location in a single step
[ https://issues.apache.org/jira/browse/HIVE-12676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko updated HIVE-12676: -- Component/s: (was: hpl/sql) > [hive+impala] Alter table Rename to + Set location in a single step > --- > > Key: HIVE-12676 > URL: https://issues.apache.org/jira/browse/HIVE-12676 > Project: Hive > Issue Type: Improvement >Reporter: Egmont Koblinger >Assignee: Dmitry Tolpeko >Priority: Minor > > Assume a nonstandard table location, let's say /foo/bar/table1. You might > want to rename from table1 to table2 and move the underlying data accordingly > to /foo/bar/table2. > The "alter table ... rename to ..." clause alters the table name, but in the > same step moves the data into the standard location > /user/hive/warehouse/table2. Then a subsequent "alter table ... set location > ..." can move it back to the desired location /foo/bar/table2. > This is problematic if there's any permission problem in the game, e.g. not > being able to write to /user/hive/warehouse. So it should be possible to move > the underlying data to its desired final place without intermittent places in > between. > A probably hard to discover workaround is to set the table to external, then > rename it, then set back to internal and then change its location. > It would be great to be able to do an "alter table ... rename to ... set > location ..." operation in a single step. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-12676) [hive+impala] Alter table Rename to + Set location in a single step
[ https://issues.apache.org/jira/browse/HIVE-12676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko reassigned HIVE-12676: - Assignee: (was: Dmitry Tolpeko) > [hive+impala] Alter table Rename to + Set location in a single step > --- > > Key: HIVE-12676 > URL: https://issues.apache.org/jira/browse/HIVE-12676 > Project: Hive > Issue Type: Improvement >Reporter: Egmont Koblinger >Priority: Minor > > Assume a nonstandard table location, let's say /foo/bar/table1. You might > want to rename from table1 to table2 and move the underlying data accordingly > to /foo/bar/table2. > The "alter table ... rename to ..." clause alters the table name, but in the > same step moves the data into the standard location > /user/hive/warehouse/table2. Then a subsequent "alter table ... set location > ..." can move it back to the desired location /foo/bar/table2. > This is problematic if there's any permission problem in the game, e.g. not > being able to write to /user/hive/warehouse. So it should be possible to move > the underlying data to its desired final place without intermittent places in > between. > A probably hard to discover workaround is to set the table to external, then > rename it, then set back to internal and then change its location. > It would be great to be able to do an "alter table ... rename to ... set > location ..." operation in a single step. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-14417) HPL/SQL: Wrong syntax generated for Postgres INSERT
[ https://issues.apache.org/jira/browse/HIVE-14417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko reassigned HIVE-14417: - Assignee: Dmitry Tolpeko > HPL/SQL: Wrong syntax generated for Postgres INSERT > --- > > Key: HIVE-14417 > URL: https://issues.apache.org/jira/browse/HIVE-14417 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Affects Versions: 2.1.0 >Reporter: Carter Shanklin >Assignee: Dmitry Tolpeko >Priority: Minor > > HPL/SQL doesn't claim any support for Postgres so I guess I can't really > complain. But if you do connect to Postgres and try an insert, this happens: > {code} > ERROR: syntax error at or near "TABLE" at character 13 > STATEMENT: INSERT INTO TABLE pgtable VALUES > (1, 2, 'a') > {code} > This "INSERT INTO TABLE" stuff isn't used against MySQL, my guess the code > generation assume it's Hive unless it's one of the other known databases so > it inserts the non-standard Hive-ism. > This was my configuration when this happened: > {code} > > > hplsql.conn.default > myhiveconn > > > hplsql.conn.myhiveconn > > org.apache.hive.jdbc.HiveDriver;jdbc:hive2://hdp250.example.com:1 > > > hplsql.conn.pgdbconn > > org.postgresql.Driver;jdbc:postgresql://hdp250.example.com/vagrant?user=vagrant&password=vagrant > > > hplsql.conn.mydbconn > > com.mysql.jdbc.Driver;jdbc:mysql://hdp250.example.com:3306/hive;hive;vagrant > > > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
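The guess in the report — that code generation emits the Hive-ism unless the target is a known database — suggests the shape of a fix: make the `INSERT` keyword sequence dialect-aware. A minimal sketch of that idea (the enum and method names are illustrative, not HPL/SQL's actual connection-profile handling):

```java
// Hypothetical sketch of dialect-aware INSERT generation: emit the
// Hive-specific "INSERT INTO TABLE" form only for Hive connections and
// the standard "INSERT INTO" form for everything else.
public class InsertDialectSketch {
    enum Dialect { HIVE, POSTGRES, MYSQL }

    static String insertInto(Dialect d, String table, String values) {
        String kw = (d == Dialect.HIVE) ? "INSERT INTO TABLE " : "INSERT INTO ";
        return kw + table + " VALUES " + values;
    }

    public static void main(String[] args) {
        // Postgres now gets standard syntax instead of the failing Hive form.
        System.out.println(insertInto(Dialect.POSTGRES, "pgtable", "(1, 2, 'a')"));
        System.out.println(insertInto(Dialect.HIVE, "htable", "(1, 2, 'a')"));
    }
}
```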
[jira] [Updated] (HIVE-11741) Add a new hook to run before query parse/compile
[ https://issues.apache.org/jira/browse/HIVE-11741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko updated HIVE-11741: -- Component/s: (was: hpl/sql) > Add a new hook to run before query parse/compile > > > Key: HIVE-11741 > URL: https://issues.apache.org/jira/browse/HIVE-11741 > Project: Hive > Issue Type: New Feature > Components: Parser, SQL >Affects Versions: 1.2.1 >Reporter: Guilherme Braccialli >Assignee: Guilherme Braccialli >Priority: Minor > Labels: patch > Attachments: HIVE-11741.patch > > > It would be nice to allow developers to extend hive query language, making > possible to use custom wildcards on queries. > People uses Python or R to iterate over vectors or lists and create SQL > commands, this could be implemented directly on sql syntax. > For example this python script: > >>> sql = "SELECT state, " > >>> for i in range(10): > ... sql += " sum(case when type = " + str(i) + " then value end) as > sum_of_" + str(i) + " ," > ... > >>> sql += " count(1) as total FROM table" > >>> print(sql) > Could be written directly in extended sql like this: > SELECT state, > %for id = 1 to 10% >sum(case when type = %id% then value end) as sum_of_%id%, > %end% > , count(1) as total > FROM table > GROUP BY state > This kind of extensibility can be easily added if we add a new hook after > VariableSubstitution call on Driver.compile method. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
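The rewrite the description asks for could be performed by a pre-parse hook that expands the `%for ... %end%` template before compilation. A toy sketch of that expansion step (the loop syntax handling here is illustrative, not part of the attached HIVE-11741.patch):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: expand "%for i = 1 to N% body %end%" into N copies
// of body with every %i% replaced by the loop value, the kind of transform
// a pre-parse hook could apply after variable substitution.
public class MacroExpandSketch {
    private static final Pattern LOOP = Pattern.compile(
        "%for\\s+(\\w+)\\s*=\\s*(\\d+)\\s+to\\s+(\\d+)%(.*?)%end%", Pattern.DOTALL);

    static String expand(String sql) {
        Matcher m = LOOP.matcher(sql);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String var = m.group(1), body = m.group(4);
            int from = Integer.parseInt(m.group(2)), to = Integer.parseInt(m.group(3));
            StringBuilder repl = new StringBuilder();
            for (int i = from; i <= to; i++) {
                repl.append(body.replace("%" + var + "%", Integer.toString(i)));
            }
            m.appendReplacement(out, Matcher.quoteReplacement(repl.toString()));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String sql = "SELECT state, %for id = 1 to 3% "
            + "sum(case when type = %id% then value end) as sum_of_%id%, %end% "
            + "count(1) as total FROM t GROUP BY state";
        System.out.println(expand(sql));
    }
}
```

A hook invoked after `VariableSubstitution` in `Driver.compile`, as the description proposes, would receive the command string, run something like `expand`, and hand the rewritten SQL on to the parser.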
[jira] [Assigned] (HIVE-13898) don't support add subtract multiply or divide for variable of decimal or double type
[ https://issues.apache.org/jira/browse/HIVE-13898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko reassigned HIVE-13898: - Assignee: Dmitry Tolpeko > don't support add subtract multiply or divide for variable of decimal or > double type > > > Key: HIVE-13898 > URL: https://issues.apache.org/jira/browse/HIVE-13898 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Affects Versions: 2.0.1 >Reporter: longgeligelong >Assignee: Dmitry Tolpeko > Attachments: HIVE-13898.patch > > > don't support > int + decimal > int + double > decimal + double > decimal + decimal > double + double -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (HIVE-13897) don't support comparison operator for variable of decimal or double type
[ https://issues.apache.org/jira/browse/HIVE-13897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko reassigned HIVE-13897: - Assignee: Dmitry Tolpeko > don't support comparison operator for variable of decimal or double type > > > Key: HIVE-13897 > URL: https://issues.apache.org/jira/browse/HIVE-13897 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Affects Versions: 2.0.1 >Reporter: longgeligelong >Assignee: Dmitry Tolpeko > Attachments: HIVE-13897.patch > > > decimal can't compare to decimal > decimal can't compare to double > decimal can't compare to integer > double can't compare to integer -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17032) HPL/SQL Comparisons are only supported with strings and integers
[ https://issues.apache.org/jira/browse/HIVE-17032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko updated HIVE-17032: -- Attachment: HIVE-17032.1.patch > HPL/SQL Comparisons are only supported with strings and integers > > > Key: HIVE-17032 > URL: https://issues.apache.org/jira/browse/HIVE-17032 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Reporter: Carter Shanklin >Assignee: Dmitry Tolpeko >Priority: Critical > Attachments: HIVE-17032.1.patch > > > This bug is part of a series of issues and surprising behavior I encountered > writing a reporting script that would aggregate values and give rows > different classifications based on an the aggregate. Addressing some or all > of these issues would make HPL/SQL more accessible to newcomers. > In Var.java: > {code} > public int compareTo(Var v) { > if (this == v) { > return 0; > } > else if (v == null) { > return -1; > } > else if (type == Type.BIGINT && v.type == Type.BIGINT) { > return ((Long)value).compareTo((Long)v.value); > } > else if (type == Type.STRING && v.type == Type.STRING) { > return ((String)value).compareTo((String)v.value); > } > return -1; > } > {code} > It's surprising that comparisons with doubles and decimals (for example) > don't work as expected. > Version = 3.0.0-SNAPSHOT r71f52d8ad512904b3f2c4f04fe39a33f2834f1f2 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17032) HPL/SQL Comparisons are only supported with strings and integers
[ https://issues.apache.org/jira/browse/HIVE-17032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko updated HIVE-17032: -- Status: Patch Available (was: Open) > HPL/SQL Comparisons are only supported with strings and integers > > > Key: HIVE-17032 > URL: https://issues.apache.org/jira/browse/HIVE-17032 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Reporter: Carter Shanklin >Assignee: Dmitry Tolpeko >Priority: Critical > Attachments: HIVE-17032.1.patch > > > This bug is part of a series of issues and surprising behavior I encountered > writing a reporting script that would aggregate values and give rows > different classifications based on an the aggregate. Addressing some or all > of these issues would make HPL/SQL more accessible to newcomers. > In Var.java: > {code} > public int compareTo(Var v) { > if (this == v) { > return 0; > } > else if (v == null) { > return -1; > } > else if (type == Type.BIGINT && v.type == Type.BIGINT) { > return ((Long)value).compareTo((Long)v.value); > } > else if (type == Type.STRING && v.type == Type.STRING) { > return ((String)value).compareTo((String)v.value); > } > return -1; > } > {code} > It's surprising that comparisons with doubles and decimals (for example) > don't work as expected. > Version = 3.0.0-SNAPSHOT r71f52d8ad512904b3f2c4f04fe39a33f2834f1f2 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17031) HPL/SQL Addition/Subtraction only supported on integers, datetimes and intervals
[ https://issues.apache.org/jira/browse/HIVE-17031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko updated HIVE-17031: -- Status: Patch Available (was: Open) > HPL/SQL Addition/Subtraction only supported on integers, datetimes and > intervals > > > Key: HIVE-17031 > URL: https://issues.apache.org/jira/browse/HIVE-17031 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Reporter: Carter Shanklin >Assignee: Dmitry Tolpeko >Priority: Critical > Attachments: HIVE-17031.1.patch > > > This bug is part of a series of issues and surprising behavior I encountered > writing a reporting script that would aggregate values and give rows > different classifications based on an the aggregate. Addressing some or all > of these issues would make HPL/SQL more accessible to newcomers. > In Expression.java: > {code} > public void operatorSub(HplsqlParser.ExprContext ctx) { > Var v1 = evalPop(ctx.expr(0)); > Var v2 = evalPop(ctx.expr(1)); > if (v1.value == null || v2.value == null) { > evalNull(); > } > else if (v1.type == Type.BIGINT && v2.type == Type.BIGINT) { > exec.stackPush(new Var((Long)v1.value - (Long)v2.value)); > } > else if (v1.type == Type.DATE && v2.type == Type.BIGINT) { > exec.stackPush(changeDateByInt((Date)v1.value, (Long)v2.value, false > /*subtract*/)); > } > else if (v1.type == Type.DATE && v2.type == Type.INTERVAL) { > exec.stackPush(new Var(((Interval)v2.value).dateChange((Date)v1.value, > false /*subtract*/))); > } > else if (v1.type == Type.TIMESTAMP && v2.type == Type.INTERVAL) { > exec.stackPush(new > Var(((Interval)v2.value).timestampChange((Timestamp)v1.value, false > /*subtract*/), v1.scale)); > } > else { > evalNull(); > } > } > {code} > So decimals and floating points are not considered. To be fair the docs don't > mention this as supported, but it is surprising. 
> Need: Test case for comparisons and equality, including nulls > Version = 3.0.0-SNAPSHOT r71f52d8ad512904b3f2c4f04fe39a33f2834f1f2 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (HIVE-17031) HPL/SQL Addition/Subtraction only supported on integers, datetimes and intervals
[ https://issues.apache.org/jira/browse/HIVE-17031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmitry Tolpeko updated HIVE-17031: -- Attachment: HIVE-17031.1.patch > HPL/SQL Addition/Subtraction only supported on integers, datetimes and > intervals > > > Key: HIVE-17031 > URL: https://issues.apache.org/jira/browse/HIVE-17031 > Project: Hive > Issue Type: Bug > Components: hpl/sql >Reporter: Carter Shanklin >Assignee: Dmitry Tolpeko >Priority: Critical > Attachments: HIVE-17031.1.patch > > > This bug is part of a series of issues and surprising behavior I encountered > writing a reporting script that would aggregate values and give rows > different classifications based on an the aggregate. Addressing some or all > of these issues would make HPL/SQL more accessible to newcomers. > In Expression.java: > {code} > public void operatorSub(HplsqlParser.ExprContext ctx) { > Var v1 = evalPop(ctx.expr(0)); > Var v2 = evalPop(ctx.expr(1)); > if (v1.value == null || v2.value == null) { > evalNull(); > } > else if (v1.type == Type.BIGINT && v2.type == Type.BIGINT) { > exec.stackPush(new Var((Long)v1.value - (Long)v2.value)); > } > else if (v1.type == Type.DATE && v2.type == Type.BIGINT) { > exec.stackPush(changeDateByInt((Date)v1.value, (Long)v2.value, false > /*subtract*/)); > } > else if (v1.type == Type.DATE && v2.type == Type.INTERVAL) { > exec.stackPush(new Var(((Interval)v2.value).dateChange((Date)v1.value, > false /*subtract*/))); > } > else if (v1.type == Type.TIMESTAMP && v2.type == Type.INTERVAL) { > exec.stackPush(new > Var(((Interval)v2.value).timestampChange((Timestamp)v1.value, false > /*subtract*/), v1.scale)); > } > else { > evalNull(); > } > } > {code} > So decimals and floating points are not considered. To be fair the docs don't > mention this as supported, but it is surprising. 
> Need: Test case for comparisons and equality, including nulls > Version = 3.0.0-SNAPSHOT r71f52d8ad512904b3f2c4f04fe39a33f2834f1f2 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-7292) Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160444#comment-16160444 ] Xuefu Zhang commented on HIVE-7292: --- [~bastrich], the answer is no. However, if there is a strong demand, Mesos support can be added. > Hive on Spark > - > > Key: HIVE-7292 > URL: https://issues.apache.org/jira/browse/HIVE-7292 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Labels: Spark-M1, Spark-M2, Spark-M3, Spark-M4, Spark-M5 > Attachments: Hive-on-Spark.pdf > > > Spark as an open-source data analytics cluster computing framework has gained > significant momentum recently. Many Hive users already have Spark installed > as their computing backbone. To take advantages of Hive, they still need to > have either MapReduce or Tez on their cluster. This initiative will provide > user a new alternative so that those user can consolidate their backend. > Secondly, providing such an alternative further increases Hive's adoption as > it exposes Spark users to a viable, feature-rich de facto standard SQL tools > on Hadoop. > Finally, allowing Hive to run on Spark also has performance benefits. Hive > queries, especially those involving multiple reducer stages, will run faster, > thus improving user experience as Tez does. > This is an umbrella JIRA which will cover many coming subtask. Design doc > will be attached here shortly, and will be on the wiki as well. Feedback from > the community is greatly appreciated! -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-7292) Hive on Spark
[ https://issues.apache.org/jira/browse/HIVE-7292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160363#comment-16160363 ] Daniil Bastrich commented on HIVE-7292: --- Does anybody know if Hive supports Spark on Mesos? > Hive on Spark > - > > Key: HIVE-7292 > URL: https://issues.apache.org/jira/browse/HIVE-7292 > Project: Hive > Issue Type: Improvement > Components: Spark >Reporter: Xuefu Zhang >Assignee: Xuefu Zhang > Labels: Spark-M1, Spark-M2, Spark-M3, Spark-M4, Spark-M5 > Attachments: Hive-on-Spark.pdf > > > Spark as an open-source data analytics cluster computing framework has gained > significant momentum recently. Many Hive users already have Spark installed > as their computing backbone. To take advantages of Hive, they still need to > have either MapReduce or Tez on their cluster. This initiative will provide > user a new alternative so that those user can consolidate their backend. > Secondly, providing such an alternative further increases Hive's adoption as > it exposes Spark users to a viable, feature-rich de facto standard SQL tools > on Hadoop. > Finally, allowing Hive to run on Spark also has performance benefits. Hive > queries, especially those involving multiple reducer stages, will run faster, > thus improving user experience as Tez does. > This is an umbrella JIRA which will cover many coming subtask. Design doc > will be attached here shortly, and will be on the wiki as well. Feedback from > the community is greatly appreciated! -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (HIVE-17366) Constraint replication in bootstrap
[ https://issues.apache.org/jira/browse/HIVE-17366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16160251#comment-16160251 ] Hive QA commented on HIVE-17366: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12886289/HIVE-17366.4.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 11033 tests executed *Failed tests:* {noformat} TestAccumuloCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=230) TestDummy - did not produce a TEST-*.xml file (likely timed out) (batchId=230) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[insert_values_orig_table_use_metadata] (batchId=61) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=100) org.apache.hadoop.hive.cli.TestPerfCliDriver.testCliDriver[query14] (batchId=234) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/6756/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/6756/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-6756/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12886289 - PreCommit-HIVE-Build > Constraint replication in bootstrap > --- > > Key: HIVE-17366 > URL: https://issues.apache.org/jira/browse/HIVE-17366 > Project: Hive > Issue Type: New Feature > Components: repl >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-17366.1.patch, HIVE-17366.2.patch, > HIVE-17366.3.patch, HIVE-17366.4.patch > > > Incremental constraint replication is tracked in HIVE-15705. 
This is to track > the bootstrap replication. -- This message was sent by Atlassian JIRA (v6.4.14#64029)