[jira] [Created] (HIVE-23347) MSCK REPAIR cannot discover partitions with upper case directory names.
Sankar Hariappan created HIVE-23347: --- Summary: MSCK REPAIR cannot discover partitions with upper case directory names. Key: HIVE-23347 URL: https://issues.apache.org/jira/browse/HIVE-23347 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 3.1.0 Reporter: Sankar Hariappan Assignee: Adesh Kumar Rao For the following scenario, we expect MSCK REPAIR to discover the partitions, but it does not. 1. Have the partitioned data paths as follows: hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=10 hdfs://mycluster/datapath/t1/Year=2020/Month=03/Day=11 2. create external table t1 (key int, value string) partitioned by (Year int, Month int, Day int) stored as orc location 'hdfs://mycluster/datapath/t1'; 3. msck repair table t1; 4. show partitions t1; --> Returns zero partitions. 5. select * from t1; --> Returns empty data. When the partition directory names are changed to lower case, this works fine: hdfs://mycluster/datapath/t1/year=2020/month=03/day=10 hdfs://mycluster/datapath/t1/year=2020/month=03/day=11 -- This message was sent by Atlassian Jira (v8.3.4#803005)
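A plausible illustration of the mismatch, assuming the metastore stores partition column names in lower case (the function and variable names below are invented for this sketch, not taken from Hive's source):

```python
# Hypothetical sketch: why a case-sensitive match skips 'Year=2020' directories
# when the partition columns are stored as 'year', 'month', 'day'.

def parse_partition_path(path, columns, case_insensitive):
    """Extract a partition spec like {'year': '2020', ...} from a directory path.

    Returns None when any path segment does not match a known partition column,
    which is how a partition directory ends up undiscovered.
    """
    spec = {}
    for segment in path.strip("/").split("/"):
        if "=" not in segment:
            continue  # non-partition path segments carry no spec info
        name, value = segment.split("=", 1)
        if case_insensitive:
            name = name.lower()
        if name not in columns:   # strict match: 'Year' != 'year'
            return None           # -> partition silently skipped
        spec[name] = value
    return spec

cols = ["year", "month", "day"]
path = "Year=2020/Month=03/Day=10"

print(parse_partition_path(path, cols, case_insensitive=False))  # None: not discovered
print(parse_partition_path(path, cols, case_insensitive=True))   # full spec recovered
```

Lower-casing the directory's column name before comparing against the metastore's column list is one way the upper-case directories would be discovered.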
[jira] [Created] (HIVE-21992) REPL DUMP throws NPE when dumping Create Function event.
Sankar Hariappan created HIVE-21992: --- Summary: REPL DUMP throws NPE when dumping Create Function event. Key: HIVE-21992 URL: https://issues.apache.org/jira/browse/HIVE-21992 Project: Hive Issue Type: Bug Components: repl Affects Versions: 4.0.0, 3.2.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan REPL DUMP throws an NPE while dumping a Create Function event. It seems a null check is missing for function.getResourceUris(). {code} java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.repl.dump.io.FunctionSerializer.writeTo(FunctionSerializer.java:54) at org.apache.hadoop.hive.ql.parse.repl.dump.events.CreateFunctionHandler.handle(CreateFunctionHandler.java:48) at org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.dumpEvent(ReplDumpTask.java:304) at org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:231) at org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:121) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2727) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2394) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2066) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1764) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1758) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226) at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87) at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:324) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at 
org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:342) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) FAILED: Execution Error, return code 4 from org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask. java.lang.NullPointerException at org.apache.hadoop.hive.ql.parse.repl.dump.io.FunctionSerializer.writeTo(FunctionSerializer.java:54) at org.apache.hadoop.hive.ql.parse.repl.dump.events.CreateFunctionHandler.handle(CreateFunctionHandler.java:48) at org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.dumpEvent(ReplDumpTask.java:304) at org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:231) at org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:121) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2727) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2394) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2066) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1764) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1758) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226) at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87) at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:324) at 
java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:342) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at
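The report points at a missing null check for function.getResourceUris(). A minimal sketch of that guard, written in Python as pseudocode (the real fix belongs in Java's FunctionSerializer; the function name here is illustrative):

```python
def serialize_resource_uris(resource_uris):
    """Serialize a function's resource URI list, tolerating a missing list.

    Without the 'is None' guard, iterating a null list is the same class of
    failure as the NPE reported in FunctionSerializer.writeTo: a function
    created without JAR/FILE resources has no resource URIs at all.
    """
    if resource_uris is None:      # the guard the report says is missing
        return []
    return [str(uri) for uri in resource_uris]

print(serialize_resource_uris(None))                          # [] instead of a crash
print(serialize_resource_uris(["hdfs:///udfs/my_udf.jar"]))   # ['hdfs:///udfs/my_udf.jar']
```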
[jira] [Created] (HIVE-21951) Llap query on external table with header or footer returns incorrect row count.
Sankar Hariappan created HIVE-21951: --- Summary: Llap query on external table with header or footer returns incorrect row count. Key: HIVE-21951 URL: https://issues.apache.org/jira/browse/HIVE-21951 Project: Hive Issue Type: Bug Components: llap, Query Processor Affects Versions: 2.4.0, 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan If we create a table with a header and footer as follows: {code} CREATE EXTERNAL TABLE IF NOT EXISTS externaltableOpenCSV (eid int, name String, salary String, destination String) COMMENT 'Employee details' ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde' STORED AS TEXTFILE LOCATION '/externaltableOpenCSV' tblproperties ("skip.header.line.count"="1", "skip.footer.line.count"="2"); {code} then a query on this table returns an incorrect row count, as the header/footer lines are not skipped. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
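The intended semantics of skip.header.line.count / skip.footer.line.count can be sketched as follows (a hedged Python illustration of the expected row filtering, not Hive's LLAP code):

```python
def apply_header_footer_skip(lines, header_count, footer_count):
    """Drop the first header_count and last footer_count lines of a text file.

    This yields the row count the bug report expects; per the report, the
    LLAP read path was not honoring these table properties, so every line
    (header and footer included) was counted as a data row.
    """
    end = len(lines) - footer_count
    # max() guards the degenerate case where header and footer overlap
    return lines[header_count:max(header_count, end)]

rows = ["eid,name,salary,destination",  # header line (skip.header.line.count=1)
        "1,alice,100,NY",
        "2,bob,200,LA",
        "totals,,,",                    # footer lines (skip.footer.line.count=2)
        "generated 2019-07-01,,,"]
print(apply_header_footer_skip(rows, 1, 2))  # only the 2 data rows remain
```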
[jira] [Created] (HIVE-21893) Handle concurrent writes when ACID tables are getting bootst
Sankar Hariappan created HIVE-21893: --- Summary: Handle concurrent writes when ACID tables are getting bootst Key: HIVE-21893 URL: https://issues.apache.org/jira/browse/HIVE-21893 Project: Hive Issue Type: Bug Reporter: Sankar Hariappan
[jira] [Created] (HIVE-21880) Enable flaky test TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites.
Sankar Hariappan created HIVE-21880: --- Summary: Enable flaky test TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites. Key: HIVE-21880 URL: https://issues.apache.org/jira/browse/HIVE-21880 Project: Hive Issue Type: Bug Components: repl Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Ashutosh Bapat Need to enable TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites, which is disabled as it is flaky and randomly failing with the below error. {code} Error Message Notification events are missing in the meta store. Stacktrace java.lang.IllegalStateException: Notification events are missing in the meta store. at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getNextNotification(HiveMetaStoreClient.java:3246) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212) at com.sun.proxy.$Proxy58.getNextNotification(Unknown Source) at org.apache.hadoop.hive.ql.metadata.events.EventUtils$MSClientNotificationFetcher.getNextNotificationEvents(EventUtils.java:107) at org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.fetchNextBatch(EventUtils.java:159) at org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.hasNext(EventUtils.java:189) at org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:231) at org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:121) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103) at 
org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2709) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2361) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2028) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1788) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1782) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223) at org.apache.hadoop.hive.ql.parse.WarehouseInstance.run(WarehouseInstance.java:227) at org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:282) at org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:265) at org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:289) at org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites(TestReplicationScenariosAcidTablesBootstrap.java:328) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at
[jira] [Created] (HIVE-21879) Disable flaky test TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites.
Sankar Hariappan created HIVE-21879: --- Summary: Disable flaky test TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites. Key: HIVE-21879 URL: https://issues.apache.org/jira/browse/HIVE-21879 Project: Hive Issue Type: Bug Components: repl Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites is flaky and fails randomly with below error. {code} Error Message Notification events are missing in the meta store. Stacktrace java.lang.IllegalStateException: Notification events are missing in the meta store. at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getNextNotification(HiveMetaStoreClient.java:3246) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212) at com.sun.proxy.$Proxy58.getNextNotification(Unknown Source) at org.apache.hadoop.hive.ql.metadata.events.EventUtils$MSClientNotificationFetcher.getNextNotificationEvents(EventUtils.java:107) at org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.fetchNextBatch(EventUtils.java:159) at org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.hasNext(EventUtils.java:189) at org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:231) at org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:121) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2709) at 
org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2361) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2028) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1788) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1782) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162) at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223) at org.apache.hadoop.hive.ql.parse.WarehouseInstance.run(WarehouseInstance.java:227) at org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:282) at org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:265) at org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:289) at org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites(TestReplicationScenariosAcidTablesBootstrap.java:328) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) at org.junit.rules.RunRules.evaluate(RunRules.java:20) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at
[jira] [Created] (HIVE-21811) Load data into partitioned table throws NPE if DB is enabled for replication.
Sankar Hariappan created HIVE-21811: --- Summary: Load data into partitioned table throws NPE if DB is enabled for replication. Key: HIVE-21811 URL: https://issues.apache.org/jira/browse/HIVE-21811 Project: Hive Issue Type: Bug Components: Standalone Metastore Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan When loading data into a partitioned table with hive.stats.autogather=true and the DB is enabled for replication, it throws an NPE. Here is the call stack. {code} 0: jdbc:hive2://ctr-e139-1542663976389-126983> LOAD DATA INPATH '/tmp/traffic_data/traffic_data-QUEENS.csv' INTO TABLE traffic_data partition (county='QUEENS'); INFO : Loading data to table traffic_database.traffic_data partition (county=QUEENS) from hdfs://ctr-e139-1542663976389-126983-01-03.hwx.site:8020/tmp/traffic_data/traffic_data-QUEENS.csv INFO : Partition traffic_database.traffic_data{county=QUEENS} stats: [numFiles=1, numRows=0, totalSize=64398392, rawDataSize=0] INFO : [Warning] could not update stats.Failed with exception Unable to alter partition. java.lang.NullPointerException org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter partition. 
java.lang.NullPointerException at org.apache.hadoop.hive.ql.metadata.Hive.alterPartitions(Hive.java:678) at org.apache.hadoop.hive.ql.exec.StatsTask.aggregateStats(StatsTask.java:261) at org.apache.hadoop.hive.ql.exec.StatsTask.execute(StatsTask.java:122) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:177) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:96) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1777) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1511) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1308) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1175) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1170) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197) at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76) at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:273) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: MetaException(message:java.lang.NullPointerException) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:6161) at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_partitions_with_environment_context(HiveMetaStore.java:3908) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105) at com.sun.proxy.$Proxy20.alter_partitions_with_environment_context(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_partitions(HiveMetaStoreClient.java:1485) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178) at com.sun.proxy.$Proxy21.alter_partitions(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.alterPartitions(Hive.java:676) ... 23 more Caused by: java.lang.NullPointerException at
[jira] [Created] (HIVE-21764) REPL DUMP should detect and bootstrap any rename table events where old table was excluded but renamed table is included.
Sankar Hariappan created HIVE-21764: --- Summary: REPL DUMP should detect and bootstrap any rename table events where old table was excluded but renamed table is included. Key: HIVE-21764 URL: https://issues.apache.org/jira/browse/HIVE-21764 Project: Hive Issue Type: Sub-task Components: repl Reporter: Sankar Hariappan Assignee: Sankar Hariappan REPL DUMP fetches the events from the NOTIFICATION_LOG table based on a regular expression + inclusion/exclusion list. So, in the case of a rename table event, the event will be ignored if the old table name doesn't match the pattern, even though the new table should be bootstrapped. REPL DUMP should have a mechanism to detect such tables and automatically bootstrap them with the incremental replication. Also, if the renamed table is excluded from the replication policy, then the old table needs to be dropped at the target as well.
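The decision table described above can be sketched as follows (the function name and return labels are hypothetical, not Hive APIs):

```python
def rename_action(old_included, new_included):
    """Decide how replication should handle a RENAME TABLE event.

    old_included / new_included: whether the old/new table name matches the
    replication policy's include/exclude rules. The return values are
    illustrative labels for the actions described in the issue.
    """
    if not old_included and new_included:
        return "bootstrap_new"        # event was filtered out at dump time,
                                      # so the new table must be bootstrapped
    if old_included and not new_included:
        return "drop_old_on_target"   # renamed out of the policy: drop the
                                      # old table at the target
    if old_included and new_included:
        return "replay_rename"        # both names in scope: normal replay
    return "ignore"                   # neither name is in scope

print(rename_action(False, True))   # bootstrap_new
```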
[jira] [Created] (HIVE-21762) REPL DUMP to support new format for replication policy input to take included tables list.
Sankar Hariappan created HIVE-21762: --- Summary: REPL DUMP to support new format for replication policy input to take included tables list. Key: HIVE-21762 URL: https://issues.apache.org/jira/browse/HIVE-21762 Project: Hive Issue Type: Sub-task Components: repl Reporter: Sankar Hariappan Assignee: Sankar Hariappan - REPL DUMP syntax: {code} REPL DUMP <repl_policy> [FROM <event_id>] [WITH <key_values>]; {code} - The new format for the replication policy has 3 parts, all separated by a dot (.): 1. The first part is the DB name. 2. The second part is the included list: comma-separated table names/regexes within square brackets []. 3. The third part is the excluded list: comma-separated table names/regexes within square brackets []. {code}
<db_name> -- Full DB replication, which is currently supported.
<db_name>.* -- Full DB replication.
<db_name>.t1 -- DB-level replication with just one table, t1.
<db_name>.[t1, t2] -- DB replication with a static list of tables t1 and t2 included.
<db_name>.[t1*, t2, *t3].[t100, 5t3, t4] -- DB replication with all tables having the prefix t1 or the suffix t3, plus table t2; excluding t100 (which has the prefix t1), 5t3 (which has the suffix t3), and t4.
{code} - Need to support regular expressions of any format. - A table is included in the dump only if it matches the regular expressions in the included list and doesn't match the excluded list.
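The policy format above can be illustrated with a small parser and matcher. This is a sketch under the assumption that '*' acts as a glob-style wildcard; the function names are invented and Hive's actual parsing differs:

```python
import re

def parse_policy(policy):
    """Split '<db>[.<include>[.<exclude>]]' into (db, include_list, exclude_list).

    Each list part is either a single pattern or a bracketed, comma-separated
    pattern list; bracketed lists may contain commas but not dots.
    """
    parts = re.findall(r"\[[^\]]*\]|[^.]+", policy)

    def as_list(p):
        p = p.strip()
        if p.startswith("["):
            return [x.strip() for x in p[1:-1].split(",")]
        return [p]

    db = parts[0]
    include = as_list(parts[1]) if len(parts) > 1 else None
    exclude = as_list(parts[2]) if len(parts) > 2 else None
    return db, include, exclude

def table_matches(table, include, exclude):
    """A table is dumped iff it matches the include list (absent = match all)
    and does not match the exclude list ('*' treated as a glob wildcard)."""
    def match_any(pats):
        return any(re.fullmatch(p.replace("*", ".*"), table) for p in pats)
    if include is not None and not match_any(include):
        return False
    if exclude is not None and match_any(exclude):
        return False
    return True

db, inc, exc = parse_policy("db.[t1*, t2, *t3].[t100, 5t3, t4]")
print([t for t in ["t1", "t100", "t2", "5t3", "8t3", "t4"] if table_matches(t, inc, exc)])
# t1 and t2 are included, 8t3 matches *t3; t100, 5t3 and t4 are excluded
```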
[jira] [Created] (HIVE-21763) Incremental replication to allow changing include/exclude tables list in replication policy.
Sankar Hariappan created HIVE-21763: --- Summary: Incremental replication to allow changing include/exclude tables list in replication policy. Key: HIVE-21763 URL: https://issues.apache.org/jira/browse/HIVE-21763 Project: Hive Issue Type: Sub-task Components: repl Reporter: Sankar Hariappan Assignee: Sankar Hariappan - REPL DUMP takes 2 inputs along with the existing FROM and WITH clauses. {code} REPL DUMP <current_repl_policy> [REPLACE <previous_repl_policy>] [FROM <event_id>] [WITH <key_values>]; {code} - current_repl_policy and previous_repl_policy can be any format mentioned in Point-4. - The REPLACE clause is supported to take the previous repl policy as input. If the REPLACE clause is not there, then the policy remains unchanged. - The rest of the format remains the same. - Now, REPL DUMP on this DB will replicate the tables based on current_repl_policy. - Any table that is added dynamically, either due to a change in the regular expression or by being added to the include list, should be bootstrapped using an independent table-level replication policy. - Hive will automatically figure out the list of tables newly included in the list by comparing the current_repl_policy & previous_repl_policy inputs, and combine the bootstrap dump for the added tables as part of the incremental dump. A "_bootstrap" directory can be created in the dump dir to accommodate all tables to be bootstrapped. - If any table is renamed, then it may get dynamically added/removed for replication based on the defined replication policy + include/exclude list. So, Hive will perform bootstrap for a table which is newly included after a rename. - REPL LOAD on an incremental dump should check for the "_bootstrap" directory and perform bootstrap load on those tables first, then continue with the incremental load based on the event directories. - REPL LOAD should check for changes in the repl policy and drop the tables/views excluded in the new policy compared to the previous policy.
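The comparison of current and previous policies described above can be sketched as a set difference (illustrative only; predicate functions stand in for full policy matching, and the function name is invented):

```python
def diff_policies(tables, in_current, in_previous):
    """Given all tables in the DB and two membership predicates, compute which
    tables need a combined bootstrap in this incremental dump (newly included)
    and which must be dropped on the target (newly excluded).
    """
    bootstrap = [t for t in tables if in_current(t) and not in_previous(t)]
    drop = [t for t in tables if in_previous(t) and not in_current(t)]
    return bootstrap, drop

tables = ["sales_2019", "sales_2020", "audit"]
prev = lambda t: t == "sales_2019"          # old policy: one table
curr = lambda t: t.startswith("sales_")     # new policy: a regex-like widening
print(diff_policies(tables, curr, prev))    # (['sales_2020'], [])
```

The bootstrap list is what would be placed under the "_bootstrap" directory of the incremental dump; the drop list is what REPL LOAD would remove at the target.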
[jira] [Created] (HIVE-21761) Support table level replication in Hive
Sankar Hariappan created HIVE-21761: --- Summary: Support table level replication in Hive Key: HIVE-21761 URL: https://issues.apache.org/jira/browse/HIVE-21761 Project: Hive Issue Type: New Feature Components: repl Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan *Requirements:* - User needs to define a replication policy to replicate any specific table. This enables the user to replicate only the business-critical tables instead of replicating all tables, which may throttle the network bandwidth and storage and also slow down Hive replication. - User needs to define the replication policy using regular expressions (such as db.sales_*), and needs to include additional tables that do not match the given pattern and exclude some tables that do match it. - User needs to dynamically add/remove tables in the list by manually changing the replication policy at run time.
[jira] [Created] (HIVE-21730) HiveStatement.getQueryId throws TProtocolException when response is null.
Sankar Hariappan created HIVE-21730: --- Summary: HiveStatement.getQueryId throws TProtocolException when response is null. Key: HIVE-21730 URL: https://issues.apache.org/jira/browse/HIVE-21730 Project: Hive Issue Type: Bug Components: HiveServer2 Reporter: Sankar Hariappan Assignee: Sankar Hariappan HiveStatement.getQueryId is failing with the below exception. {code} 2019-05-14T02:09:01,355 INFO [HiveServer2-Background-Pool: Thread-1829] ql.Driver: Executing command(queryId=hive_20190514020858_530a33d9-0b19-4f72-ae08-b631fb4749cb): create table household_demographics stored as orc as select * from household_demographics_txt 2019-05-14T02:09:01,356 INFO [HiveServer2-Background-Pool: Thread-1829] hooks.HiveProtoLoggingHook: Received pre-hook notification for: hive_20190514020858_530a33d9-0b19-4f72-ae08-b631fb4749cb 2019-05-14T02:09:01,356 ERROR [HiveServer2-Handler-Pool: Thread-131] server.TThreadPoolServer: Thrift error occurred during processing of message. org.apache.thrift.protocol.TProtocolException: Required field 'queryId' is unset! Struct:TGetQueryIdResp(queryId:null) at org.apache.hive.service.rpc.thrift.TGetQueryIdResp.validate(TGetQueryIdResp.java:294) ~[hive-exec-2.1.0.2.6.5.1150-19.jar:2.1.0.2.6.5.1150-19] at org.apache.hive.service.rpc.thrift.TCLIService$GetQueryId_result.validate(TCLIService.java:18890) ~[hive-exec-2.1.0.2.6.5.1150-19.jar:2.1.0.2.6.5.1150-19] at org.apache.hive.service.rpc.thrift.TCLIService$GetQueryId_result$GetQueryId_resultStandardScheme.write(TCLIService.java:18947) ~[hive-exec-2.1.0.2.6.5.1150-19.jar:2.1.0.2.6.5.1150-19] at org.apache.hive.service.rpc.thrift.TCLIService$GetQueryId_result$GetQueryId_resultStandardScheme.write(TCLIService.java:18916) ~[hive-exec-2.1.0.2.6.5.1150-19.jar:2.1.0.2.6.5.1150-19] at org.apache.hive.service.rpc.thrift.TCLIService$GetQueryId_result.write(TCLIService.java:18867) ~[hive-exec-2.1.0.2.6.5.1150-19.jar:2.1.0.2.6.5.1150-19] at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:53) ~[hive-exec-2.1.0.2.6.5.1150-19.jar:2.1.0.2.6.5.1150-19] at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[hive-exec-2.1.0.2.6.5.1150-19.jar:2.1.0.2.6.5.1150-19] at org.apache.hive.service.auth.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:56) ~[hive-service-2.1.0.2.6.5.1150-19.jar:2.1.0.2.6.5.1150-19] at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) [hive-exec-2.1.0.2.6.5.1150-19.jar:2.1.0.2.6.5.1150-19] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161] {code}
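One way to avoid tripping the Thrift 'required field' validation is to return an empty string when no query id is available yet. A hedged sketch in Python (the real code path is Java in HiveServer2, and safe_query_id is an invented name, not the actual method):

```python
def safe_query_id(query_id):
    """Substitute '' for a missing query id so a response object whose
    queryId field is required can always be populated and serialized.

    TGetQueryIdResp declares queryId as a required Thrift field, so sending
    null fails validation at serialization time, as in the trace above.
    """
    return query_id if query_id is not None else ""

print(safe_query_id(None))                       # '' instead of a protocol error
print(safe_query_id("hive_20190514020858_..."))  # unchanged when present
```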
[jira] [Created] (HIVE-21706) REPL Dump with concurrent drop of external table fails.
Sankar Hariappan created HIVE-21706: --- Summary: REPL Dump with concurrent drop of external table fails. Key: HIVE-21706 URL: https://issues.apache.org/jira/browse/HIVE-21706 Project: Hive Issue Type: Bug Components: HiveServer2, repl Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan During REPL DUMP of a DB having external tables, if any of the external tables is dropped concurrently, then REPL DUMP fails with the below exception. {code} 2019-05-08T11:57:25,702 ERROR [HiveServer2-Background-Pool: Thread-3307]: repl.ReplDumpTask (:()) - failed org.apache.hadoop.hive.ql.metadata.HiveException: Unable to fetch table catalog_sales_new. null at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1387) ~[hive-exec-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1336) ~[hive-exec-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1316) ~[hive-exec-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:1298) ~[hive-exec-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:259) ~[hive-exec-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:121) ~[hive-exec-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) ~[hive-exec-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) ~[hive-exec-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2711) ~[hive-exec-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2382) ~[hive-exec-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2054) 
~[hive-exec-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1752) ~[hive-exec-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1746) ~[hive-exec-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) ~[hive-exec-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226) ~[hive-service-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87) ~[hive-service-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:324) ~[hive-service-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_181] at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_181] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) ~[hadoop-common-3.1.1.3.1.0.31-12.jar:?] 
at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:342) ~[hive-service-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_181] at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_181] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_181] at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_181] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_181] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_181] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181] Caused by: org.apache.thrift.transport.TTransportException at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132) ~[hive-exec-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86) ~[hive-exec-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:376) ~[hive-exec-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:453) ~[hive-exec-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:435) ~[hive-exec-3.1.0.3.1.0.31-12.jar:3.1.0.3.1.0.31-12] at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:37)
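One way to make the dump robust to the race above is to treat a table that disappears mid-dump as skippable rather than fatal. This is a hedged sketch, not the actual HIVE-21706 patch; `TableLookup` and `dumpTable` are hypothetical names standing in for the metastore lookup that threw the HiveException:

```java
import java.util.Optional;

public class DumpDroppedTableSketch {
    // Stand-in for the metastore lookup that threw HiveException above.
    interface TableLookup {
        Optional<String> getTable(String name);
    }

    // Skip a concurrently dropped table instead of failing the whole dump.
    static boolean dumpTable(TableLookup lookup, String name) {
        Optional<String> table = lookup.getTable(name);
        if (!table.isPresent()) {
            // Table was dropped between event generation and dump: skip it.
            return false;
        }
        // ... real code would serialize table metadata and data here ...
        return true;
    }
}
```

A dump driver would then log and continue when `dumpTable` returns false, instead of propagating the failure up through ReplDumpTask.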
[jira] [Created] (HIVE-21671) Replicate Streaming ingestion with transactional batch size as 1.
Sankar Hariappan created HIVE-21671: --- Summary: Replicate Streaming ingestion with transactional batch size as 1. Key: HIVE-21671 URL: https://issues.apache.org/jira/browse/HIVE-21671 Project: Hive Issue Type: Sub-task Components: repl, Streaming, Transactions Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Replicate streaming ingestion via HiveStreamingConnection on ACID tables with a transaction batch size of 1. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21669) HS2 throws NPE when HiveStatement.getQueryId is invoked and query is closed concurrently.
Sankar Hariappan created HIVE-21669: --- Summary: HS2 throws NPE when HiveStatement.getQueryId is invoked and query is closed concurrently. Key: HIVE-21669 URL: https://issues.apache.org/jira/browse/HIVE-21669 Project: Hive Issue Type: Bug Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan HS2 throws NullPointerException if HiveStatement.getQueryId is invoked without executing any query, or after the query is closed. It should instead return null so that the caller can check for it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21654) External table location is not preserved at target when base dir is set as /.
Sankar Hariappan created HIVE-21654: --- Summary: External table location is not preserved at target when base dir is set as /. Key: HIVE-21654 URL: https://issues.apache.org/jira/browse/HIVE-21654 Project: Hive Issue Type: Bug Components: repl Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan The external table location is not preserved (kept the same as the source path) when the base directory is set to "/".
Source path: /tmp/ext/src/db1/ext1
Target path: /ext/src/db1/ext1 --> It should be /tmp/ext/src/db1/ext1 itself.
External table base dir supplied: /
If the base dir input is changed to "/abc", then the target path is set to "/abc/tmp/ext/src/db1/ext1", which is correct. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
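The expected mapping can be sketched as follows, using plain `java.nio.file.Paths` (`targetPath` is a hypothetical helper, not the actual Hive code): the full source path, minus its leading slash, is resolved under the supplied base dir, so a base dir of "/" leaves the source path intact.

```java
import java.nio.file.Paths;

public class ExternalTableTargetPathSketch {
    // Hypothetical helper showing the mapping the ticket expects.
    static String targetPath(String baseDir, String sourcePath) {
        // Preserve the whole source hierarchy under the base directory.
        String relative = sourcePath.startsWith("/") ? sourcePath.substring(1) : sourcePath;
        return Paths.get(baseDir, relative).toString();
    }
}
```

With base dir "/" this yields /tmp/ext/src/db1/ext1 (the source path itself), and with "/abc" it yields /abc/tmp/ext/src/db1/ext1, matching the correct behaviour described above.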
[jira] [Created] (HIVE-21602) Dropping an external table created by migration case should delete the data directory.
Sankar Hariappan created HIVE-21602: --- Summary: Dropping an external table created by migration case should delete the data directory. Key: HIVE-21602 URL: https://issues.apache.org/jira/browse/HIVE-21602 Project: Hive Issue Type: Bug Components: repl Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan For external tables, when the table is dropped, the location is not removed. But if the source table is managed and is converted to external at the target, then the table location should be removed when the table is dropped. The replication flow should set the additional parameter "external.table.purge"="true" when migrating to an external table. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21564) Load data into a bucketed table is ignoring partitions specs and loading data into default partition.
Sankar Hariappan created HIVE-21564: --- Summary: Load data into a bucketed table is ignoring partitions specs and loading data into default partition. Key: HIVE-21564 URL: https://issues.apache.org/jira/browse/HIVE-21564 Project: Hive Issue Type: Bug Components: Query Planning Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan When running the below command to load data into a bucketed table, the data is not loaded into the specified partition; instead it is loaded into the default partition.
LOAD DATA INPATH '/tmp/files/00_0' OVERWRITE INTO TABLE call PARTITION(year_partition=2012, month=12);
SELECT * FROM call WHERE year_partition=2012 AND month=12; --> returns 0 rows.
{code}
CREATE TABLE call(
  date_time_date date,
  ssn string,
  name string,
  location string)
PARTITIONED BY (
  year_partition int,
  month int)
CLUSTERED BY (
  date_time_date)
SORTED BY (
  date_time_date ASC)
INTO 1 BUCKETS
STORED AS ORC;
{code}
If hive.exec.dynamic.partition is set to false, it fails with the below error.
{code}
Error: Error while compiling statement: FAILED: SemanticException 1:18 Dynamic partition is disabled. Either enable it by setting hive.exec.dynamic.partition=true or specify partition column values. Error encountered near token 'month' (state=42000,code=4)
{code}
When we "set hive.strict.checks.bucketing=false;", the load works fine. This is a behaviour imposed by HIVE-15148 to avoid incorrectly named data files being loaded into bucketed tables. In the customer use case, if the files are named properly with the bucket_id (0_0, 0_1 etc), then it is safe to set this flag to false. However, the current behaviour of loading into the default partition when hive.strict.checks.bucketing=true and partitions are specified is a bug introduced by HIVE-19311, where the given query is re-written into an insert query (to handle incorrect file names and Orc versions) but the rewrite failed to incorporate the partition specs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21540) Query with join condition having date literal throws SemanticException.
Sankar Hariappan created HIVE-21540: --- Summary: Query with join condition having date literal throws SemanticException. Key: HIVE-21540 URL: https://issues.apache.org/jira/browse/HIVE-21540 Project: Hive Issue Type: Bug Components: Query Planning Affects Versions: 3.1.0, 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan This semantic exception is thrown for the following query. *SemanticException '2019-03-20' encountered with 0 children*
{code}
create table date_1 (key int, dd date);
create table date_2 (key int, dd date);
select d1.key, d2.dd from
  (select key, dd as start_dd, current_date as end_dd from date_1) d1
  join date_2 as d2 on d1.key = d2.key
where d2.dd between start_dd and end_dd;
{code}
When the WHERE condition below is commented out, the query completes successfully. where d2.dd between start_dd and end_dd -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21530) Replicate Streaming ingest on ACID tables.
Sankar Hariappan created HIVE-21530: --- Summary: Replicate Streaming ingest on ACID tables. Key: HIVE-21530 URL: https://issues.apache.org/jira/browse/HIVE-21530 Project: Hive Issue Type: Sub-task Components: repl, Transactions Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: mahesh kumar behera Attachments: Hive ACID Replication_ Streaming Ingest Tables.pdf Implement replication of Hive streaming ingest of tables as per [^Hive ACID Replication_ Streaming Ingest Tables.pdf].
- Changes to txn_commit to include information about the transaction batch.
- Changes to the copy task to only copy if there is a difference in file size or checksum; this seems specific to transaction batches and shouldn't be used for normal transactions.
- Copy the correct sequence of files w.r.t. data file + side file.
- Remove side files (which look like they are suffixed as _flush in file names) when the batch is committed.
- How do we determine the idempotent nature of the events here? Update the corresponding table + partition and do not copy a new version of the file.
- Validate that partially copied data files are handled on the target warehouse given the correct side file.
- Can we leave the side file forever? If, during a transaction batch copy, the primary warehouse fails after certain transactions are copied over, we won't be able to remove the _flush file; on failover, do we have to handle this?
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21529) Hive support bootstrap of ACID/MM tables on an existing policy.
Sankar Hariappan created HIVE-21529: --- Summary: Hive support bootstrap of ACID/MM tables on an existing policy. Key: HIVE-21529 URL: https://issues.apache.org/jira/browse/HIVE-21529 Project: Hive Issue Type: Sub-task Components: repl, Transactions Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Ashutosh Bapat If ACID/MM tables are to be enabled (hive.repl.dump.include.acid.tables) on an existing repl policy, then we need to combine a bootstrap dump of these tables with the ongoing incremental dump. Shall add a one-time config "hive.repl.bootstrap.acid.tables" to include the bootstrap in the given dump. Also need to support hive.repl.bootstrap.cleanup.type for ACID tables to clean up partially bootstrapped tables in case of retry. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21500) Support converting managed ACID table to external if the corresponding non-ACID table is converted to external at source.
Sankar Hariappan created HIVE-21500: --- Summary: Support converting managed ACID table to external if the corresponding non-ACID table is converted to external at source. Key: HIVE-21500 URL: https://issues.apache.org/jira/browse/HIVE-21500 Project: Hive Issue Type: Sub-task Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan For the below scenario of Hive2 to Hive3 replication (with strict managed=true), the managed ACID table at target should be converted to external table. 1. Create non-ACID ORC format table. 2. Insert some rows 3. Replicate this create event which creates ACID table at target (due to migration rule). Each insert event adds metadata in HMS corresponding to the current table. 4. Convert table to external table using ALTER command. 5. Replicating this alter event should convert ACID table to external table and make sure corresponding metadata are removed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21471) Replicating conversion of managed to external table leaks HDFS files at target.
Sankar Hariappan created HIVE-21471: --- Summary: Replicating conversion of managed to external table leaks HDFS files at target. Key: HIVE-21471 URL: https://issues.apache.org/jira/browse/HIVE-21471 Project: Hive Issue Type: Bug Components: repl Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan While replicating the ALTER event that converts a managed table to an external table, the data location for the table is changed to fall under the input base directory for external table replication. But the old location remains and would be leaked forever. ALTER TABLE T1 SET TBLPROPERTIES('EXTERNAL'='true'); -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21421) HiveStatement.getQueryId throws NPE when query is not running.
Sankar Hariappan created HIVE-21421: --- Summary: HiveStatement.getQueryId throws NPE when query is not running. Key: HIVE-21421 URL: https://issues.apache.org/jira/browse/HIVE-21421 Project: Hive Issue Type: Bug Components: JDBC Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan HiveStatement.getQueryId throws NullPointerException if it is invoked without executing any query, or after the query is closed. It should instead return null so that the caller can check for it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
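The proposed contract can be sketched like this (a standalone illustration with hypothetical field names; the real HiveStatement keeps a thrift operation handle rather than a plain Object):

```java
public class QueryIdSketch {
    // Stand-in for HiveStatement's operation handle: null before any query
    // runs and null again after close().
    private Object stmtHandle;

    void execute() { stmtHandle = new Object(); } // placeholder for running a query
    void close()   { stmtHandle = null; }

    // Return null instead of dereferencing a null handle (the NPE above).
    String getQueryId() {
        Object handle = stmtHandle;
        if (handle == null) {
            return null; // caller checks for null
        }
        return Integer.toHexString(handle.hashCode()); // placeholder for the thrift call
    }
}
```

The key point is that the null check happens inside getQueryId, so the caller sees null rather than a NullPointerException.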
[jira] [Created] (HIVE-21403) Incorrect error code returned when retry bootstrap with different dump.
Sankar Hariappan created HIVE-21403: --- Summary: Incorrect error code returned when retry bootstrap with different dump. Key: HIVE-21403 URL: https://issues.apache.org/jira/browse/HIVE-21403 Project: Hive Issue Type: Bug Components: repl Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Retrying an incremental bootstrap on a table with a different bootstrap dump throws error code 4 instead of 20017. {code} Error while processing statement: FAILED: Execution Error, return code 4 from org.apache.hadoop.hive.ql.exec.repl.ReplLoadTask. InvalidOperationException(message:Load path hdfs://ctr-e139-1542663976389-61669-01-03.hwx.site:8020/apps/hive/repl/3d704b34-bf1a-40c9-b70c-57319e6462f6 not valid as target database is bootstrapped from some other path : hdfs://ctr-e139-1542663976389-61669-01-03.hwx.site:8020/apps/hive/repl/c3e5ec9e-d951-48aa-b3f4-9aeaf5e010ea.) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21307) Need to set GzipJSONMessageEncoder as default config for EVENT_MESSAGE_FACTORY.
Sankar Hariappan created HIVE-21307: --- Summary: Need to set GzipJSONMessageEncoder as default config for EVENT_MESSAGE_FACTORY. Key: HIVE-21307 URL: https://issues.apache.org/jira/browse/HIVE-21307 Project: Hive Issue Type: Bug Components: Configuration, repl Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Currently, we use JsonMessageEncoder as the default message factory for notification events. The size of some of these events is really huge and causes OOM issues in the RDBMS. So, GzipJSONMessageEncoder needs to be enabled as the default message factory to optimise memory usage. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
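The memory saving comes from plain gzip over the JSON payload; a quick java.util.zip illustration of the effect (this is standard-library code, not the Hive encoder itself):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class GzipJsonSketch {
    // Gzip a JSON string; large, repetitive event payloads shrink dramatically.
    static byte[] gzip(String json) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(json.getBytes(StandardCharsets.UTF_8));
        }
        return bos.toByteArray();
    }
}
```

Notification event payloads (long lists of partitions, files, etc.) are highly repetitive, which is exactly where gzip pays off before the message is stored in the NOTIFICATION_LOG table.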
[jira] [Created] (HIVE-21286) Hive should support clean-up of incrementally bootstrapped tables when retry from different dump.
Sankar Hariappan created HIVE-21286: --- Summary: Hive should support clean-up of incrementally bootstrapped tables when retry from different dump. Key: HIVE-21286 URL: https://issues.apache.org/jira/browse/HIVE-21286 Project: Hive Issue Type: Bug Components: repl Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan If external tables are enabled for replication on an existing repl policy, then bootstrapping of external tables is combined with the incremental dump. If the incremental bootstrap load fails with a non-retryable error, the user will have to manually drop all the external tables before trying with another bootstrap dump. For full bootstrap, to retry with a different dump, we suggested the user drop the DB, but in this case they need to manually drop all the external tables, which is not user friendly. So, it needs to be handled on the Hive side as follows: REPL LOAD takes an additional config (passed by the user in the WITH clause) that says drop all the tables which are part of this bootstrap dump. There are 4 cases possible.
1. Only external tables - Drop all external tables before triggering bootstrap load.
2. Only ACID/MM tables - Drop all ACID/MM tables before triggering bootstrap load.
3. Both external and ACID/MM tables - Drop both external and ACID/MM tables before triggering bootstrap load.
4. Table level replication with bootstrap - Drop all the tables that match the diff of the previous and current repl policy (pattern + include/exclude list) before triggering bootstrap load.
Configuration: hive.repl.bootstrap.cleanup.type= {1=external_tables, 2=transactional_tables, 3=external_and_transactional_tables, 4=table_level} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21281) Repl checkpointing doesn't work while retry bootstrap load with partitions of external tables.
Sankar Hariappan created HIVE-21281: --- Summary: Repl checkpointing doesn't work while retry bootstrap load with partitions of external tables. Key: HIVE-21281 URL: https://issues.apache.org/jira/browse/HIVE-21281 Project: Hive Issue Type: Bug Components: repl Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan The repl checkpoint feature optimises the retry logic of bootstrap repl load by skipping properly loaded tables and partitions. In case of a retry of bootstrap load with external tables having partitions, the checkpoint doesn't work and partitions are always loaded again. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21269) Hive replication should mandate -update and -delete as DistCp options to avoid data inconsistency.
Sankar Hariappan created HIVE-21269: --- Summary: Hive replication should mandate -update and -delete as DistCp options to avoid data inconsistency. Key: HIVE-21269 URL: https://issues.apache.org/jira/browse/HIVE-21269 Project: Hive Issue Type: Bug Components: repl Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Currently, external table replication copies data at the directory level. So, if the target directory exists, DistCp should compare and update or skip the data files in the directory instead of creating a new directory inside the pre-existing target directory. This can be achieved using -update. Also, the -delete option is needed to delete files missing in the source directory but present in the target. Hive should mandate these DistCp options even if the user passes other options. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
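Mandating the options amounts to merging them into whatever the user supplied; a minimal sketch (the method name is illustrative, not Hive's actual option handling):

```java
import java.util.ArrayList;
import java.util.List;

public class DistCpOptionsSketch {
    // Ensure -update and -delete are always present, regardless of user input.
    static List<String> withMandatoryOptions(List<String> userOptions) {
        List<String> merged = new ArrayList<>(userOptions);
        if (!merged.contains("-update")) merged.add("-update");
        if (!merged.contains("-delete")) merged.add("-delete");
        return merged;
    }
}
```

User-supplied options survive the merge; only the two options the ticket requires are appended when missing.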
[jira] [Created] (HIVE-21261) Incremental replication adds redundant COPY and MOVE tasks for external table events.
Sankar Hariappan created HIVE-21261: --- Summary: Incremental replication adds redundant COPY and MOVE tasks for external table events. Key: HIVE-21261 URL: https://issues.apache.org/jira/browse/HIVE-21261 Project: Hive Issue Type: Improvement Components: repl Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan For external table replication, the data gets copied as a separate task based on the data locations listed in the _external_tables_info file in the dump. So, individual events such as ADD_PARTITION or INSERT on external tables should avoid copying data; it is enough to create the table / add partition DDL tasks. COPY and MOVE tasks should be skipped. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21206) Bootstrap replication to target with strict managed table enabled is running slow.
Sankar Hariappan created HIVE-21206: --- Summary: Bootstrap replication to target with strict managed table enabled is running slow. Key: HIVE-21206 URL: https://issues.apache.org/jira/browse/HIVE-21206 Project: Hive Issue Type: Sub-task Components: repl Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Hive bootstrap replication of 1TB of data, on-prem to on-prem (Hive2 (Strict_Managed=false) to Hive3 (Strict_Managed=true)), is running slower. Times taken for replication are as below:
||Hive2 - Hive2||Hive3 - Hive3||
|Bootstrap: 01h27m|Bootstrap: 03h45m|
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-21186) REPL LOAD for external tables throws NPE if relative path is set for hive.repl.replica.external.table.base.dir.
Sankar Hariappan created HIVE-21186: --- Summary: REPL LOAD for external tables throws NPE if relative path is set for hive.repl.replica.external.table.base.dir. Key: HIVE-21186 URL: https://issues.apache.org/jira/browse/HIVE-21186 Project: Hive Issue Type: Bug Components: repl Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan REPL DUMP is fine. Load seems to be throwing exception:
{code}
2019-01-29 09:25:12,671 ERROR HiveServer2-Background-Pool: Thread-4864: ql.Driver (SessionState.java:printError(1129)) - FAILED: Execution Error, return code 4 from org.apache.hadoop.hive.ql.exec.repl.ReplLoadTask. java.lang.NullPointerException
2019-01-29 09:25:12,671 INFO HiveServer2-Background-Pool: Thread-4864: ql.Driver (Driver.java:execute(1661)) - task failed with org.apache.hadoop.hive.ql.parse.SemanticException: java.lang.NullPointerException
 at org.apache.hadoop.hive.ql.exec.repl.bootstrap.load.table.LoadTable.tasks(LoadTable.java:154)
 at org.apache.hadoop.hive.ql.exec.repl.ReplLoadTask.executeBootStrapLoad(ReplLoadTask.java:141)
 at org.apache.hadoop.hive.ql.exec.repl.ReplLoadTask.execute(ReplLoadTask.java:82)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:177)
 at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:93)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1777)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1511)
 at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1308)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1175)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1170)
 at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197)
 at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76)
 at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
 at org.apache.hive.service.cli.operation.SQLOperation$2.run(SQLOperation.java:273)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
 at org.apache.hadoop.hive.ql.exec.repl.bootstrap.load.util.PathUtils.getExternalTmpPath(PathUtils.java:35)
 at org.apache.hadoop.hive.ql.exec.repl.bootstrap.load.table.LoadTable.loadTableTask(LoadTable.java:245)
 at org.apache.hadoop.hive.ql.exec.repl.bootstrap.load.table.LoadTable.newTableTasks(LoadTable.java:189)
 at org.apache.hadoop.hive.ql.exec.repl.bootstrap.load.table.LoadTable.tasks(LoadTable.java:136)
 ... 23 more
{code}
REPL Load statement:
{code}
REPL LOAD `testdb1_tgt` FROM 'hdfs://ctr-e139-1542663976389-56533-01-11.hwx.site:8020/apps/hive/repl/c9476207-8179-4db7-b947-ba67c950a340' WITH ('hive.query.id'='testHive1_3dd5e281-89ef-4054-850e-8a34386fc2c8','hive.exec.parallel'='true','hive.repl.replica.external.table.base.dir'='/tmp/someNewloc/','hive.repl.include.external.tables'='true','mapreduce.map.java.opts'='-Xmx640m','hive.distcp.privileged.doAs'='beacon','distcp.options.pugpb'='')
{code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20884) Support bootstrap of tables to target with hive.strict.managed.tables enabled.
Sankar Hariappan created HIVE-20884: --- Summary: Support bootstrap of tables to target with hive.strict.managed.tables enabled. Key: HIVE-20884 URL: https://issues.apache.org/jira/browse/HIVE-20884 Project: Hive Issue Type: Sub-task Components: repl Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Hive2 supports replication of managed tables. But in Hive3, some of these managed tables are converted to ACID or MM tables, and some are converted to external tables, based on the rules below.
# Avro format, storage handlers, and list-bucketed tables are converted to external tables.
# Locations not owned by the "hive" user are converted to external tables.
# Hive-owned ORC format is converted to a full ACID transactional table.
# Hive-owned non-ORC format is converted to an MM transactional table.
REPL LOAD should apply these rules during bootstrap and convert the tables accordingly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
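The four rules above form a simple decision function; here is a hedged sketch with simplified boolean inputs (real HMS metadata is far richer than this, and the names are illustrative):

```java
public class MigrationRuleSketch {
    enum TargetType { EXTERNAL, FULL_ACID, MM }

    // Encode the four conversion rules from the ticket.
    static TargetType classify(boolean avroOrHandlerOrListBucketed,
                               boolean locationOwnedByHive,
                               boolean orcFormat) {
        // Rules 1 and 2: special formats or foreign-owned locations go external.
        if (avroOrHandlerOrListBucketed || !locationOwnedByHive) {
            return TargetType.EXTERNAL;
        }
        // Rules 3 and 4: hive-owned ORC becomes full ACID, non-ORC becomes MM.
        return orcFormat ? TargetType.FULL_ACID : TargetType.MM;
    }
}
```

Ordering matters: the external-table rules are checked first, so a hive-owned ORC table only becomes full ACID when neither external rule fires.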
[jira] [Created] (HIVE-20882) Support Hive replication to a target cluster with hive.strict.managed.tables enabled.
Sankar Hariappan created HIVE-20882: --- Summary: Support Hive replication to a target cluster with hive.strict.managed.tables enabled. Key: HIVE-20882 URL: https://issues.apache.org/jira/browse/HIVE-20882 Project: Hive Issue Type: New Feature Components: repl Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan *Requirements:* - Support Hive replication with Hive2 as master and Hive3 as slave where hive.strict.managed.tables is enabled. - The non-ACID managed tables from Hive2 should be converted to appropriate ACID or MM tables or to an external table based on Hive3 table type rules. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20883) REPL DUMP to dump the default warehouse directory of source.
Sankar Hariappan created HIVE-20883: --- Summary: REPL DUMP to dump the default warehouse directory of source. Key: HIVE-20883 URL: https://issues.apache.org/jira/browse/HIVE-20883 Project: Hive Issue Type: Sub-task Components: repl Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan The default warehouse directory of the source is needed by target to detect if DB or table location is set by user or assigned by Hive. Using this information, REPL LOAD will decide to preserve the path or move data to default managed table's warehouse directory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20762) NOTIFICATION_LOG cleanup interval is hardcoded as 60s and is too small.
Sankar Hariappan created HIVE-20762: --- Summary: NOTIFICATION_LOG cleanup interval is hardcoded as 60s and is too small. Key: HIVE-20762 URL: https://issues.apache.org/jira/browse/HIVE-20762 Project: Hive Issue Type: Bug Components: HiveServer2, Metastore Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan The NOTIFICATION_LOG cleanup interval is hardcoded as 60s, which is too small. It should be set to several hours, or else the number of metastore calls would be too high and impact other operations. Make it a configurable item and set it to 2 hours. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20761) Select for update on notification sequence table has retry interval and retries count too small.
Sankar Hariappan created HIVE-20761: --- Summary: Select for update on notification sequence table has retry interval and retries count too small. Key: HIVE-20761 URL: https://issues.apache.org/jira/browse/HIVE-20761 Project: Hive Issue Type: Bug Components: HiveServer2, Metastore Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Hive DDLs are intermittently failing with the error: Couldn't acquire the DB log notification lock because we reached the maximum # of retries: 5 retries
{code:java}
Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Couldn't acquire the DB log notification lock because we reached the maximum # of retries: 5 retries. If this happens too often, then is recommended to increase the maximum number of retries on the hive.notification.sequence.lock.max.retries configuration :: Error executing SQL query "select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" for update".)
2018-08-28 01:17:56,808|INFO|MainThread|machine.py:183 - run()||GUID=94e6ff4d-5db8-45eb-8654-76f546e7f0b3|java.sql.SQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Couldn't acquire the DB log notification lock because we reached the maximum # of retries: 5 retries. If this happens too often, then is recommended to increase the maximum number of retries on the hive.notification.sequence.lock.max.retries configuration :: Error executing SQL query "select "NEXT_EVENT_ID" from "NOTIFICATION_SEQUENCE" for update".)
{code}
It seems metastore operations are slow in this cluster, and hence concurrent writes/DDL operations are failing to lock the row for update. Currently, the sleep interval between retries is specified via the config *hive.notification.sequence.lock.retry.sleep.interval*. The default value is 500 ms, which seems to be too small.
Can we set higher values for the sleep interval and retry count? *hive.notification.sequence.lock.retry.sleep.interval=10s* *hive.notification.sequence.lock.max.retries=10* -- This message was sent by Atlassian JIRA (v7.6.3#76005)
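The retry loop in question is essentially this shape (a sketch; the interface and method names are illustrative, and the real code reads the two configs named above rather than taking parameters):

```java
public class NotificationLockRetrySketch {
    interface LockAttempt {
        boolean tryLock(); // stand-in for the "select ... for update" succeeding
    }

    // Retry up to maxRetries times, sleeping sleepMillis between failed attempts.
    static boolean acquireWithRetries(LockAttempt attempt, int maxRetries, long sleepMillis)
            throws InterruptedException {
        for (int i = 0; i < maxRetries; i++) {
            if (attempt.tryLock()) {
                return true;
            }
            Thread.sleep(sleepMillis); // proposed: 10s instead of the 500ms default
        }
        return false; // surfaces as "reached the maximum # of retries"
    }
}
```

With 10 retries at 10s each, a slow metastore gets up to ~100s to release the row lock before the DDL fails, versus ~2.5s with the current defaults.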
[jira] [Created] (HIVE-20682) Running sync query in master thread can potentially fail previous async query execution if the shared sessionHive object is closed.
Sankar Hariappan created HIVE-20682: --- Summary: Running sync query in master thread can potentially fail previous async query execution if the shared sessionHive object is closed. Key: HIVE-20682 URL: https://issues.apache.org/jira/browse/HIVE-20682 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 3.1.0, 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan *Problem description:* The master thread initializes the *sessionHive* object in the *HiveSessionImpl* class when we open a new session for a client connection, and by default all queries from this connection share the same sessionHive object. If the master thread executes a *synchronous* query, it closes the sessionHive object (referred to via the thread-local hiveDb) if {{Hive.isCompatible}} returns false and sets a new Hive object in the thread-local hiveDb, but doesn't change the sessionHive object in the session. In contrast, *asynchronous* query execution via async threads never closes the sessionHive object; it just creates a new one if needed and sets it as the thread-local hiveDb. So, the problem can happen when an *asynchronous* query being executed by an async thread refers to the sessionHive object and the master thread receives a *synchronous* query that closes the same sessionHive object. Also, each query execution overwrites the thread-local hiveDb object with the sessionHive object, which potentially leaks a metastore connection if the previous synchronous query execution re-created the Hive object. *Possible Fix:* We shall maintain an atomic reference counter in the Hive object. We increment the counter when somebody sets it as the thread-local hiveDb and decrement it when somebody releases it. Only when the counter is down to 0 should we close the connection. A couple of cases release the thread-local hiveDb:
* When synchronous query execution in the master thread re-creates the Hive object due to a config change. We also need to update the sessionHive object in the current session as we release it from the thread-local hiveDb of the master thread.
* When an async thread exits after completing execution or due to an exception. If the session is getting closed, it is guaranteed that the reference counter is down to 0, as we forcefully close all the async operations, so we can close the connection there.
cc [~pvary] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
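The proposed fix can be sketched with an AtomicInteger (this is a standalone illustration, not the real Hive class; `closed` here is a placeholder for tearing down the metastore connection):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RefCountedHiveSketch {
    private final AtomicInteger refCount = new AtomicInteger(0);
    private volatile boolean closed = false;

    // Called when a thread sets this object as its thread-local hiveDb.
    void acquire() {
        refCount.incrementAndGet();
    }

    // Called when a thread releases its thread-local hiveDb; only the last
    // releaser actually closes the metastore connection.
    void release() {
        if (refCount.decrementAndGet() == 0) {
            closed = true; // placeholder for closing the metastore connection
        }
    }

    boolean isClosed() {
        return closed;
    }
}
```

With this scheme, a synchronous query in the master thread can safely release the shared object while an async thread still holds a reference: the connection closes only when the last holder releases it.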
[jira] [Created] (HIVE-20646) Partition filter condition is not pushed down to metastore query if it has IS NOT NULL.
Sankar Hariappan created HIVE-20646: --- Summary: Partition filter condition is not pushed down to metastore query if it has IS NOT NULL. Key: HIVE-20646 URL: https://issues.apache.org/jira/browse/HIVE-20646 Project: Hive Issue Type: Improvement Components: HiveServer2, Standalone Metastore Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan If the partition filter condition has "is not null", then the filter isn't pushed down to the SQL query in the RDBMS. This slows down metastore API calls for getting the list of partitions with a filter condition. This condition gets added by the optimizer in many cases, so this affects many queries. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20632) Query with get_splits UDF fails if materialized view is created on queried table.
Sankar Hariappan created HIVE-20632: --- Summary: Query with get_splits UDF fails if materialized view is created on queried table. Key: HIVE-20632 URL: https://issues.apache.org/jira/browse/HIVE-20632 Project: Hive Issue Type: Bug Components: HiveServer2, Materialized views, Standalone Metastore, UDF Affects Versions: 4.0.0, 3.2.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Scenario:
# Create ACID table t1 and insert a few rows.
# Create materialized view mv as select a from t1 where a > 5;
# Run the get_splits query "select get_splits( select a from t1 where a > 5);" -- this fails with AssertionError.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20627) Concurrent Async queries from same session intermittently fail with LockException.
Sankar Hariappan created HIVE-20627: --- Summary: Concurrent Async queries from same session intermittently fail with LockException. Key: HIVE-20627 URL: https://issues.apache.org/jira/browse/HIVE-20627 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 4.0.0, 3.2.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan
When multiple async queries are executed from the same session, the async query execution DAGs share the same Hive object, which is set by the caller for all threads. When loading dynamic partitions, a MoveTask is created, which re-creates the Hive object and closes the shared one, causing metastore connection issues for the other async execution threads that still access it. This is also seen if ReplDumpTask and ReplLoadTask are part of the DAG.
*Root cause:*
For async query execution from SQLOperation.runInternal, we set the thread-local Hive object for all the child threads to parentHive (parentSession.getSessionHive()):
{code}
@Override
public void run() {
  PrivilegedExceptionAction doAsAction = new PrivilegedExceptionAction() {
    @Override
    public Object run() throws HiveSQLException {
      Hive.set(parentHive); // Setting parentHive for all async operations.
      // TODO: can this result in cross-thread reuse of session state?
      SessionState.setCurrentSessionState(parentSessionState);
      PerfLogger.setPerfLogger(parentPerfLogger);
      LogUtils.registerLoggingContext(queryState.getConf());
      try {
        if (asyncPrepare) {
          prepare(queryState);
        }
        runQuery();
      } catch (HiveSQLException e) {
        // TODO: why do we invent our own error path op top of the one from Future.get?
        setOperationException(e);
        LOG.error("Error running hive query: ", e);
      } finally {
        LogUtils.unregisterLoggingContext();
      }
      return null;
    }
  };
{code}
Now, while async execution is in progress, if one of the threads re-creates the Hive object, it closes the parentHive object first, which impacts the other threads using it: the conf object they refer to gets cleaned up, and hence we get null for the VALID_TXNS_KEY value.
{code}
private static Hive create(HiveConf c, boolean needsRefresh, Hive db, boolean doRegisterAllFns)
    throws HiveException {
  if (db != null) {
    LOG.debug("Creating new db. db = " + db + ", needsRefresh = " + needsRefresh +
        ", db.isCurrentUserOwner = " + db.isCurrentUserOwner());
    db.close();
  }
  closeCurrent();
  if (c == null) {
    c = createHiveConf();
  }
  c.set("fs.scheme.class", "dfs");
  Hive newdb = new Hive(c, doRegisterAllFns);
  hiveDB.set(newdb);
  return newdb;
}
{code}
*Fix:*
We shouldn't close the old Hive object if it is shared by multiple threads; use a flag to track this.
*Memory leak issue:*
A memory leak occurs if one of the threads in Hive.loadDynamicPartitions throws an exception. rawStoreMap is used to store the RawStore objects that have to be cleaned up, but it is populated only in the success flow; if there are exceptions it is not, and hence there is a leak.
{code}
futures.add(pool.submit(new Callable() {
  @Override
  public Void call() throws Exception {
    try {
      // move file would require session details (needCopy() invokes SessionState.get)
      SessionState.setCurrentSessionState(parentSession);
      LOG.info("New loading path = " + partPath + " with partSpec " + fullPartSpec);
      // load the partition
      Partition newPartition = loadPartition(partPath, tbl, fullPartSpec, loadFileType,
          true, false, numLB > 0, false, isAcid, hasFollowingStatsTask, writeId, stmtId,
          isInsertOverwrite);
      partitionsMap.put(fullPartSpec, newPartition);
      if (inPlaceEligible) {
        synchronized (ps) {
          InPlaceUpdate.rePositionCursor(ps);
          partitionsLoaded.incrementAndGet();
          InPlaceUpdate.reprintLine(ps, "Loaded : " + partitionsLoaded.get() + "/"
              + partsToLoad + " partitions.");
        }
      }
      // Add embedded rawstore, so we can cleanup later to avoid memory leak
      if (getMSC().isLocalMetaStore()) {
        if (!rawStoreMap.containsKey(Thread.currentThread().getId())) {
          rawStoreMap.put(Thread.currentThread().getId(), HiveMetaStore.HMSHandler.getRawStore());
        }
      }
      return null;
    } catch (Exception t) {
{code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
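The rawStoreMap leak above comes down to registering the per-thread resource only on the success path. One natural shape for a fix (a hypothetical sketch; rawStoreMap and loadOnePartition are stand-ins for Hive's internals, and this is not the patch that was actually committed) is to do the registration in a finally block so exception paths are covered too:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class RegisterInFinally {
    // Stand-in for the map of per-thread RawStore objects pending cleanup.
    static final Map<Long, String> rawStoreMap = new ConcurrentHashMap<>();

    static void loadOnePartition(boolean fail) {
        try {
            if (fail) {
                throw new RuntimeException("simulated partition load failure");
            }
        } finally {
            // Runs on success AND failure, so the resource is always tracked.
            rawStoreMap.putIfAbsent(Thread.currentThread().getId(), "rawStore");
        }
    }

    public static void main(String[] args) {
        try {
            loadOnePartition(true);
        } catch (RuntimeException expected) {
            // the load failed, but cleanup registration still happened
        }
        System.out.println(rawStoreMap.size()); // 1
    }
}
```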
[jira] [Created] (HIVE-20607) TxnHandler should use PreparedStatement to execute direct SQL queries.
Sankar Hariappan created HIVE-20607: --- Summary: TxnHandler should use PreparedStatement to execute direct SQL queries. Key: HIVE-20607 URL: https://issues.apache.org/jira/browse/HIVE-20607 Project: Hive Issue Type: Bug Components: Standalone Metastore, Transactions Affects Versions: 4.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 4.0.0 TxnHandler uses direct SQL queries to operate on txn-related databases/tables in the Hive metastore RDBMS. Most of these methods are called directly from the metastore API and append the input string arguments directly to the SQL string. We need to use a parameterised PreparedStatement object to set these arguments instead. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
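The parameterised approach can be sketched as follows. buildInClause is an illustrative helper (not TxnHandler's actual code): it emits JDBC "?" placeholders, and the values are then bound via PreparedStatement setters rather than concatenated into the SQL text.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;

public class ParamSql {
    // Builds e.g. "TXN_ID in (?,?,?)" for a 3-element list of values.
    static String buildInClause(String column, int n) {
        StringBuilder sb = new StringBuilder(column).append(" in (");
        for (int i = 0; i < n; i++) {
            sb.append(i == 0 ? "?" : ",?");
        }
        return sb.append(")").toString();
    }

    // Hypothetical query method: values are bound, never string-concatenated.
    static void query(Connection conn, List<Long> txnIds) throws Exception {
        String sql = "select TXN_STATE from TXNS where "
                + buildInClause("TXN_ID", txnIds.size());
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (int i = 0; i < txnIds.size(); i++) {
                ps.setLong(i + 1, txnIds.get(i)); // JDBC parameters are 1-based
            }
            ps.executeQuery();
        }
    }

    public static void main(String[] args) {
        System.out.println(buildInClause("TXN_ID", 3)); // TXN_ID in (?,?,?)
    }
}
```

Binding through placeholders both prevents SQL injection from metastore API arguments and lets the RDBMS cache the statement plan.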
[jira] [Created] (HIVE-20543) Support replication of Materialized views
Sankar Hariappan created HIVE-20543: --- Summary: Support replication of Materialized views Key: HIVE-20543 URL: https://issues.apache.org/jira/browse/HIVE-20543 Project: Hive Issue Type: Sub-task Components: Materialized views, repl Affects Versions: 4.0.0, 3.2.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Currently, materialized views are replicated, but they don't work if the DB is renamed after load. Also, the ALTER MATERIALIZED VIEW [db_name.]materialized_view_name REBUILD; command is not replicated, so the MV remains stale and out of sync with the source. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20542) Incremental REPL DUMP progress information log message is incorrect.
Sankar Hariappan created HIVE-20542: --- Summary: Incremental REPL DUMP progress information log message is incorrect. Key: HIVE-20542 URL: https://issues.apache.org/jira/browse/HIVE-20542 Project: Hive Issue Type: Bug Components: repl Affects Versions: 4.0.0, 3.2.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Incremental REPL DUMP logs the progress information as "eventsDumpProgress":"49/0". It should log the estimated number of events as the denominator, but it always comes out as 0. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20541) REPL DUMP on external table with multiple partitions throws NoSuchElementException.
Sankar Hariappan created HIVE-20541: --- Summary: REPL DUMP on external table with multiple partitions throws NoSuchElementException. Key: HIVE-20541 URL: https://issues.apache.org/jira/browse/HIVE-20541 Project: Hive Issue Type: Bug Components: repl Affects Versions: 4.0.0, 3.2.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan REPL DUMP on an external table with multiple partitions throws NoSuchElementException. We need to check hasNext() on the file iterator before accessing the next element. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20476) REPL LOAD and EXPORT/IMPORT operations ignore distcp failures.
Sankar Hariappan created HIVE-20476: --- Summary: REPL LOAD and EXPORT/IMPORT operations ignore distcp failures. Key: HIVE-20476 URL: https://issues.apache.org/jira/browse/HIVE-20476 Project: Hive Issue Type: Bug Components: HiveServer2, repl Affects Versions: 3.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan CopyUtils uses FileUtils.distCp to copy files but doesn't check the return value, which is false when the copy fails. REPL LOAD and EXPORT/IMPORT commands internally use CopyUtils to copy data files across clusters, so they may report success even when a file copy fails, which can cause data loss. We need to throw an error and retry. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
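A rough sketch of the proposed handling, with doDistCp as a stand-in for the FileUtils.distCp call (the retry bound and error type here are illustrative, not Hive's actual values):

```java
import java.util.function.Supplier;

public class CheckedCopy {
    // doDistCp stands in for FileUtils.distCp, which returns false on failure.
    static void copyWithRetry(Supplier<Boolean> doDistCp, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (doDistCp.get()) {
                return; // copy succeeded
            }
        }
        // Previously a false return was silently ignored; now it is an error.
        throw new RuntimeException("file copy failed after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        int[] calls = {0};
        copyWithRetry(() -> ++calls[0] >= 3, 5); // fails twice, then succeeds
        System.out.println(calls[0]); // 3
    }
}
```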
[jira] [Created] (HIVE-20450) Add replication test for LOAD command on ACID table.
Sankar Hariappan created HIVE-20450: --- Summary: Add replication test for LOAD command on ACID table. Key: HIVE-20450 URL: https://issues.apache.org/jira/browse/HIVE-20450 Project: Hive Issue Type: Bug Components: repl Reporter: Sankar Hariappan Assignee: Sankar Hariappan Add replication test for LOAD command on ACID/MM table. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20371) Queries failing with Internal error processing add_write_notification_log
Sankar Hariappan created HIVE-20371: --- Summary: Queries failing with Internal error processing add_write_notification_log Key: HIVE-20371 URL: https://issues.apache.org/jira/browse/HIVE-20371 Project: Hive Issue Type: Bug Components: HiveServer2, repl, Standalone Metastore Affects Versions: 4.0.0, 3.2.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Queries fail with the following error: {noformat} ERROR : FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. org.apache.thrift.TApplicationException: Internal error processing add_write_notification_log INFO : Completed executing command(queryId=hive_20180806072916_a9ae37a9-869f-4218-8357-a96ba713db69); Time taken: 878.604 seconds Error: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask. org.apache.thrift.TApplicationException: Internal error processing add_write_notification_log (state=08S01,code=1) {noformat} From the hiveserver log: {noformat} 2018-08-06T07:59:33,656 ERROR [HiveServer2-Background-Pool: Thread-1551]: operation.Operation (:()) - Error running hive query: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask.
org.apache.thrift.TApplicationException: Internal error processing add_write_notification_log at org.apache.hive.service.cli.operation.Operation.toSQLException(Operation.java:335) ~[hive-service-3.1.0.3.0.1.0-59.jar:3.1.0.3.0.1.0-59] at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:226) ~[hive-service-3.1.0.3.0.1.0-59.jar:3.1.0.3.0.1.0-59] at org.apache.hive.service.cli.operation.SQLOperation.access$700(SQLOperation.java:87) ~[hive-service-3.1.0.3.0.1.0-59.jar:3.1.0.3.0.1.0-59] at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:316) ~[hive-service-3.1.0.3.0.1.0-59.jar:3.1.0.3.0.1.0-59] at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_112] at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_112] at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729) ~[hadoop-common-3.1.0.3.0.1.0-59.jar:?] at org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork.run(SQLOperation.java:329) ~[hive-service-3.1.0.3.0.1.0-59.jar:3.1.0.3.0.1.0-59] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_112] at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_112] at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_112] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[?:1.8.0_112] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) ~[?:1.8.0_112] at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112] Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.thrift.TApplicationException: Internal error processing add_write_notification_log at org.apache.hadoop.hive.ql.metadata.Hive.addWriteNotificationLog(Hive.java:2879) ~[hive-exec-3.1.0.3.0.1.0-59.jar:3.1.0.3.0.1.0-59] at 
org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:2035) ~[hive-exec-3.1.0.3.0.1.0-59.jar:3.1.0.3.0.1.0-59] at org.apache.hadoop.hive.ql.exec.MoveTask.handleStaticParts(MoveTask.java:477) ~[hive-exec-3.1.0.3.0.1.0-59.jar:3.1.0.3.0.1.0-59] at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:397) ~[hive-exec-3.1.0.3.0.1.0-59.jar:3.1.0.3.0.1.0-59] at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205) ~[hive-exec-3.1.0.3.0.1.0-59.jar:3.1.0.3.0.1.0-59] at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) ~[hive-exec-3.1.0.3.0.1.0-59.jar:3.1.0.3.0.1.0-59] at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2679) ~[hive-exec-3.1.0.3.0.1.0-59.jar:3.1.0.3.0.1.0-59] at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2350) ~[hive-exec-3.1.0.3.0.1.0-59.jar:3.1.0.3.0.1.0-59] at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2026) ~[hive-exec-3.1.0.3.0.1.0-59.jar:3.1.0.3.0.1.0-59] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1724) ~[hive-exec-3.1.0.3.0.1.0-59.jar:3.1.0.3.0.1.0-59] at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1718) ~[hive-exec-3.1.0.3.0.1.0-59.jar:3.1.0.3.0.1.0-59] at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) ~[hive-exec-3.1.0.3.0.1.0-59.jar:3.1.0.3.0.1.0-59] at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:224) ~[hive-service-3.1.0.3.0.1.0-59.jar:3.1.0.3.0.1.0-59] ... 13 more Caused by:
[jira] [Created] (HIVE-20361) ReplDumpTaskTest is failing.
Sankar Hariappan created HIVE-20361: --- Summary: ReplDumpTaskTest is failing. Key: HIVE-20361 URL: https://issues.apache.org/jira/browse/HIVE-20361 Project: Hive Issue Type: Bug Components: repl, Test Affects Versions: 4.0.0, 3.2.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan ReplDumpTaskTest is failing but not running in ptest. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20316) Skip external table file listing for create table event.
Sankar Hariappan created HIVE-20316: --- Summary: Skip external table file listing for create table event. Key: HIVE-20316 URL: https://issues.apache.org/jira/browse/HIVE-20316 Project: Hive Issue Type: Bug Components: HCatalog, repl Affects Versions: 3.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan We are currently skipping external table replication. We shall also skip listing all the files during create table event generation for external tables (when external table replication is disabled). External tables might have a very large number of files, so listing them would take a long time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20264) Bootstrap repl dump with concurrent write and drop of ACID table makes target inconsistent.
Sankar Hariappan created HIVE-20264: --- Summary: Bootstrap repl dump with concurrent write and drop of ACID table makes target inconsistent. Key: HIVE-20264 URL: https://issues.apache.org/jira/browse/HIVE-20264 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 4.0.0, 3.2.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan
During bootstrap dump of ACID tables, consider the below sequence.
- Get lastReplId = last event ID logged.
- Current session (Thread-1), REPL DUMP -> Open txn (Txn1) - Event-10
- Another session (Thread-2), Open txn (Txn2) - Event-11
- Thread-2 -> Insert data (T1.D1) to ACID table. - Event-12
- Thread-2 -> Commit Txn (Txn2) - Event-13
- Thread-2 -> Drop table (T1) - Event-14
- Thread-1 -> Dump ACID tables based on the validTxnList of Txn1. --> This step skips all the data written by txns > Txn1, so T1 will be missing.
- Thread-1 -> Commit Txn (Txn1)
- REPL LOAD from the bootstrap dump will skip T1.
- Incremental REPL DUMP will start from Event-10 and hence allocates a write id for table T1; drop table (T1) is idempotent. So, stale entries remain in the TXN_TO_WRITE_ID and NEXT_WRITE_ID metastore tables at the target.
- Now, if we create another table with the same name T1 at the source and replicate it, readers at the target may see incorrect data for T1.
A couple of proposals:
1. Make allocate-write-id idempotent. This is not possible, as the table doesn't exist, and MM table import may allocate a write id before creating the table, so the 2 cases cannot be differentiated.
2. Make the drop table event drop entries from the TXN_TO_WRITE_ID and NEXT_WRITE_ID tables irrespective of whether the table exists at the target.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20209) Metastore connection fail 1st attempt in repl dump
Sankar Hariappan created HIVE-20209: --- Summary: Metastore connection fail 1st attempt in repl dump Key: HIVE-20209 URL: https://issues.apache.org/jira/browse/HIVE-20209 Project: Hive Issue Type: Bug Reporter: Sankar Hariappan Assignee: Sankar Hariappan Run the following command: {code} repl dump `*` from 60758 with ('hive.repl.dump.metadata.only'='true', 'hive.repl.dump.include.acid.tables'='true'); {code} See this in hs2.log: {code} 2018-07-10T18:07:32,308 INFO [HiveServer2-Handler-Pool: Thread-14380]: conf.HiveConf (HiveConf.java:getLogIdVar(5061)) - Using the default value passed in for log id: f1e13736-3f10-4abf-a29b-683b534dfa4c 2018-07-10T18:07:32,309 INFO [HiveServer2-Handler-Pool: Thread-14380]: session.SessionState (:()) - Updating thread name to f1e13736-3f10-4abf-a29b-683b534dfa4c HiveServer2-Handler-Pool: Thread-14380 2018-07-10T18:07:32,311 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c HiveServer2-Handler-Pool: Thread-14380]: operation.OperationManager (:()) - Adding operation: OperationHandle [opType=EXECUTE_STATEMENT, getHandleIdentifier()=16eb1d07-e125-490c-8ab8-90192bfd459b] 2018-07-10T18:07:32,314 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c HiveServer2-Handler-Pool: Thread-14380]: ql.Driver (:()) - Compiling command(queryId=hive_20180710180732_7dcc20db-90db-486d-a825-e6fa91dc092b): repl dump `*` from 60758 with ('hive.repl.dump.metadata.only'='true', 'hive.repl.dump.include.acid.tables'='true') 2018-07-10T18:07:32,317 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c HiveServer2-Handler-Pool: Thread-14380]: metastore.HiveMetaStoreClient (:()) - Trying to connect to metastore with URI thrift://hwx-demo-2.field.hortonworks.com:9083 2018-07-10T18:07:32,317 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c HiveServer2-Handler-Pool: Thread-14380]: metastore.HiveMetaStoreClient (:()) - Opened a connection to metastore, current connections: 19 2018-07-10T18:07:32,319 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c HiveServer2-Handler-Pool: Thread-14380]: 
metastore.HiveMetaStoreClient (:()) - Connected to metastore. 2018-07-10T18:07:32,319 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c HiveServer2-Handler-Pool: Thread-14380]: metastore.RetryingMetaStoreClient (:()) - RetryingMetaStoreClient proxy=class org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient ugi=hive (auth:SIMPLE) retries=24 delay=5 lifetime=0 2018-07-10T18:07:32,439 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c HiveServer2-Handler-Pool: Thread-14380]: ql.Driver (:()) - Semantic Analysis Completed (retrial = false) 2018-07-10T18:07:32,440 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c HiveServer2-Handler-Pool: Thread-14380]: ql.Driver (:()) - Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:dump_dir, type:string, comment:from deserializer), FieldSchema(name:last_repl_id, type:string, comment:from deserializer)], properties:null) 2018-07-10T18:07:32,443 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c HiveServer2-Handler-Pool: Thread-14380]: exec.ListSinkOperator (:()) - Initializing operator LIST_SINK[0] 2018-07-10T18:07:32,446 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c HiveServer2-Handler-Pool: Thread-14380]: ql.Driver (:()) - Completed compiling command(queryId=hive_20180710180732_7dcc20db-90db-486d-a825-e6fa91dc092b); Time taken: 0.132 seconds 2018-07-10T18:07:32,447 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c HiveServer2-Handler-Pool: Thread-14380]: conf.HiveConf (HiveConf.java:getLogIdVar(5061)) - Using the default value passed in for log id: f1e13736-3f10-4abf-a29b-683b534dfa4c 2018-07-10T18:07:32,448 INFO [f1e13736-3f10-4abf-a29b-683b534dfa4c HiveServer2-Handler-Pool: Thread-14380]: session.SessionState (:()) - Resetting thread name to HiveServer2-Handler-Pool: Thread-14380 2018-07-10T18:07:32,451 INFO [HiveServer2-Background-Pool: Thread-15161]: reexec.ReExecDriver (:()) - Execution #1 of query 2018-07-10T18:07:32,452 INFO [HiveServer2-Background-Pool: Thread-15161]: lockmgr.DbTxnManager (:()) - Setting lock request transaction to txnid:30327 
for queryId=hive_20180710180732_7dcc20db-90db-486d-a825-e6fa91dc092b 2018-07-10T18:07:32,454 INFO [HiveServer2-Background-Pool: Thread-15161]: lockmgr.DbLockManager (:()) - Requesting: queryId=hive_20180710180732_7dcc20db-90db-486d-a825-e6fa91dc092b LockRequest(component:[LockComponent(type:SHARED_READ, level:DB, dbname:default, operationType:SELECT), LockComponent(type:SHARED_READ, level:DB, dbname:hwxdemo, operationType:SELECT), LockComponent(type:SHARED_READ, level:DB, dbname:information_schema, operationType:SELECT), LockComponent(type:SHARED_READ, level:DB, dbname:sys, operationType:SELECT)], txnid:30327, user:hive, hostname:hwx-demo-2.field.hortonworks.com, agentInfo:hive_20180710180732_7dcc20db-90db-486d-a825-e6fa91dc092b) 2018-07-10T18:07:32,497 INFO [HiveServer2-Background-Pool: Thread-15161]:
[jira] [Created] (HIVE-20192) HS2 is leaking JDOPersistenceManager objects.
Sankar Hariappan created HIVE-20192: --- Summary: HS2 is leaking JDOPersistenceManager objects. Key: HIVE-20192 URL: https://issues.apache.org/jira/browse/HIVE-20192 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 3.0.0, 3.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan On HiveServer2 instances, HS2 is observed in an unresponsive state every 3-4 days, with full GC collections happening regularly. From the JXray report it is seen that pmCache (a List of JDOPersistenceManager objects) occupies 84% of the heap, and there are around 16,000 references to UDFClassLoader. When the RawStore object is re-created, it is not allowed to be updated in the ThreadWithGarbageCleanup.threadRawStoreMap, so the new RawStore never gets cleaned up when the thread exits. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20025) Clean-up of event files created by HiveProtoLoggingHook.
Sankar Hariappan created HIVE-20025: --- Summary: Clean-up of event files created by HiveProtoLoggingHook. Key: HIVE-20025 URL: https://issues.apache.org/jira/browse/HIVE-20025 Project: Hive Issue Type: Bug Components: HiveServer2 Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 4.0.0 Currently, HiveProtoLoggingHook writes event data to HDFS. The number of files can grow very large. Since the files are created under a folder with the date as part of the path, Hive should have a way to clean up data older than a certain configured time/date. This can be a job that runs as infrequently as once a day, with the retention time defaulting to 1 week. There should also be a sane upper bound on the number of files, so that when a large cluster generates a lot of files during a spike, we don't force the cluster to fall over. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
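A minimal sketch of such a TTL-based cleanup job, using local java.nio.file as a stand-in for the HDFS API the hook actually writes to (directory layout and the 1-week cutoff mirror the ticket; the helper names are illustrative):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class EventFileCleaner {

    // Delete direct children of dir whose modification time is before cutoff;
    // returns how many files were removed.
    static long deleteOlderThan(Path dir, Instant cutoff) {
        try {
            List<Path> entries;
            try (Stream<Path> s = Files.list(dir)) {
                entries = s.collect(Collectors.toList());
            }
            long deleted = 0;
            for (Path p : entries) {
                if (Files.getLastModifiedTime(p).toInstant().isBefore(cutoff)) {
                    Files.delete(p);
                    deleted++;
                }
            }
            return deleted;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Self-contained scenario: one 10-day-old file, one fresh file, 1-week TTL.
    static long demo() {
        try {
            Path dir = Files.createTempDirectory("events");
            Path old = Files.createFile(dir.resolve("old_event"));
            Files.setLastModifiedTime(old,
                    FileTime.from(Instant.now().minus(10, ChronoUnit.DAYS)));
            Files.createFile(dir.resolve("fresh_event"));
            return deleteOlderThan(dir, Instant.now().minus(7, ChronoUnit.DAYS));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 1: only the week-old file is removed
    }
}
```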
[jira] [Created] (HIVE-19927) Last Repl ID set by bootstrap dump is not proper and may cause loss of data if there are ACID tables.
Sankar Hariappan created HIVE-19927: --- Summary: Last Repl ID set by bootstrap dump is not proper and may cause loss of data if there are ACID tables. Key: HIVE-19927 URL: https://issues.apache.org/jira/browse/HIVE-19927 Project: Hive Issue Type: Sub-task Components: HiveServer2, Transactions Affects Versions: 3.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan
During bootstrap dump of ACID tables, consider the below sequence.
- Current session (REPL DUMP), Open txn (Txn1) - Event-10
- Another session (Session-2), Open txn (Txn2) - Event-11
- Session-2 -> Insert data (T1.D1) to ACID table. - Event-12
- Get lastReplId = last event ID logged. (Event-12)
- Session-2 -> Commit Txn (Txn2) - Event-13
- Dump ACID tables based on the validTxnList of Txn1. --> This step skips all the data written by txns > Txn1, so T1.D1 will be missing.
- Commit Txn (Txn1)
- REPL LOAD from the bootstrap dump will skip T1.D1.
- Incremental REPL DUMP will start from Event-13 and hence lose Txn2, which was opened after Txn1. So, the data T1.D1 will be lost forever.
Proposal: capture the lastReplId of the bootstrap before opening the current txn (Txn1), store it in the Driver context, and use it for the dump.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19815) Repl dump should not propagate the checkpoint and repl source properties
Sankar Hariappan created HIVE-19815: --- Summary: Repl dump should not propagate the checkpoint and repl source properties Key: HIVE-19815 URL: https://issues.apache.org/jira/browse/HIVE-19815 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 3.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.1.0, 4.0.0 For replication scenarios of A -> B -> C, the repl dump on B should not include the checkpoint property when dumping out table information. Alter table/partition operations during incremental replication should not propagate it either. The db-level parameters set internally by replication should also not be propagated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19739) Bootstrap REPL LOAD to use checkpoints to validate and skip the loaded DB/tables/partition objects.
Sankar Hariappan created HIVE-19739: --- Summary: Bootstrap REPL LOAD to use checkpoints to validate and skip the loaded DB/tables/partition objects. Key: HIVE-19739 URL: https://issues.apache.org/jira/browse/HIVE-19739 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 4.0.0 Currently, bootstrap REPL LOAD expects the target database to be empty or non-existent before starting the bootstrap load. This adds overhead when there is a failure partway through the bootstrap load, since there is no way to resume from where it failed. So, we need to create checkpoints in tables/partitions to skip the completely loaded objects. Use the fully qualified path of the dump directory as the checkpoint identifier. This should be added to the table/partition properties in Hive via a task, as the last task in the DAG for table/partition creation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19499) Bootstrap REPL LOAD shall add tasks to create checkpoints for tables/partitions.
Sankar Hariappan created HIVE-19499: --- Summary: Bootstrap REPL LOAD shall add tasks to create checkpoints for tables/partitions. Key: HIVE-19499 URL: https://issues.apache.org/jira/browse/HIVE-19499 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.1.0 Currently, bootstrap REPL LOAD expects the target database to be empty or non-existent before starting the bootstrap load. This adds overhead when there is a failure partway through the bootstrap load, since there is no way to resume from where it failed. So, we need to create checkpoints in tables/partitions to skip the completely loaded objects. Use a hash of the fully qualified path of the dump directory as the checkpoint identifier. This should be added to the table/partition properties in Hive via a task, as the last task in the DAG for table/partition creation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
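The checkpoint idea can be sketched as follows. This is a hedged illustration, not Hive's implementation: the property name CKPT_PROP and the MD5 choice are assumptions; the ticket only specifies "a hash of the fully qualified path of the dump directory" stored as a table/partition property.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

public class BootstrapCheckpoint {
    static final String CKPT_PROP = "repl.ckpt.key"; // hypothetical property name

    // Hash of the dump directory's fully qualified path.
    static String checkpoint(String dumpDirPath) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] d = md5.digest(dumpDirPath.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(1, d).toString(16);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // True if this object was already fully loaded from the same dump.
    static boolean alreadyLoaded(Map<String, String> props, String dumpDir) {
        return checkpoint(dumpDir).equals(props.get(CKPT_PROP));
    }

    public static void main(String[] args) {
        Map<String, String> props = new HashMap<>();
        String dump = "hdfs://nn:8020/user/hive/repl/dump_1";
        System.out.println(alreadyLoaded(props, dump)); // false: not yet loaded
        props.put(CKPT_PROP, checkpoint(dump));         // last task in the DAG records it
        System.out.println(alreadyLoaded(props, dump)); // true: a resumed load can skip it
    }
}
```

Because the property is written by the last task in the table/partition creation DAG, a partially loaded object never carries the checkpoint, so a resumed load redoes exactly the incomplete objects.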
[jira] [Created] (HIVE-19476) Fix failures in TestReplicationScenariosAcidTables, TestReplicationOnHDFSEncryptedZones and TestCopyUtils
Sankar Hariappan created HIVE-19476: --- Summary: Fix failures in TestReplicationScenariosAcidTables, TestReplicationOnHDFSEncryptedZones and TestCopyUtils Key: HIVE-19476 URL: https://issues.apache.org/jira/browse/HIVE-19476 Project: Hive Issue Type: Bug Components: HiveServer2, repl Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0, 3.1.0 If the incremental dump has a drop of a partitioned table followed by create/insert on a non-partitioned table with the same name, the data doesn't get replicated. Explained below. Let's say we have a partitioned table T1 that was already replicated to the target. DROP_TABLE(T1) -> CREATE_TABLE(T1) (non-partitioned) -> INSERT(T1)(10). After REPL LOAD, T1 doesn't have any data. The same holds for the non-partitioned to partitioned case and for the partition-spec mismatch case. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19435) Data lost when Incremental REPL LOAD with Drop partitioned table followed by create/insert non-partitioned table with same name.
Sankar Hariappan created HIVE-19435: --- Summary: Data lost when Incremental REPL LOAD with Drop partitioned table followed by create/insert non-partitioned table with same name. Key: HIVE-19435 URL: https://issues.apache.org/jira/browse/HIVE-19435 Project: Hive Issue Type: Bug Components: HiveServer2, repl Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0, 3.1.0 Hive replication uses Hadoop distcp to copy files from the primary to the replica warehouse. If the HDFS block size is different across clusters, it causes file copy failures. {code:java} 2018-04-09 14:32:06,690 ERROR [main] org.apache.hadoop.tools.mapred.CopyMapper: Failure in copying hdfs://chelsea/apps/hive/warehouse/tpch_flat_orc_1000.db/customer/000259_0 to hdfs://marilyn/apps/hive/warehouse/tpch_flat_orc_1000.db/customer/.hive-staging_hive_2018-04-09_14-30-45_723_7153496419225102220-2/-ext-10001/000259_0 java.io.IOException: File copy failed: hdfs://chelsea/apps/hive/warehouse/tpch_flat_orc_1000.db/customer/000259_0 --> hdfs://marilyn/apps/hive/warehouse/tpch_flat_orc_1000.db/customer/.hive-staging_hive_2018-04-09_14-30-45_723_7153496419225102220-2/-ext-10001/000259_0 at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:299) at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:266) at org.apache.hadoop.tools.mapred.CopyMapper.map(CopyMapper.java:52) at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146) at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341) at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:170) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869) at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:164) Caused by: java.io.IOException: Couldn't run retriable-command: Copying
hdfs://chelsea/apps/hive/warehouse/tpch_flat_orc_1000.db/customer/000259_0 to hdfs://marilyn/apps/hive/warehouse/tpch_flat_orc_1000.db/customer/.hive-staging_hive_2018-04-09_14-30-45_723_7153496419225102220-2/-ext-10001/000259_0 at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:101) at org.apache.hadoop.tools.mapred.CopyMapper.copyFileWithRetry(CopyMapper.java:296) ... 10 more Caused by: java.io.IOException: Check-sum mismatch between hdfs://chelsea/apps/hive/warehouse/tpch_flat_orc_1000.db/customer/000259_0 and hdfs://marilyn/apps/hive/warehouse/tpch_flat_orc_1000.db/customer/.hive-staging_hive_2018-04-09_14-30-45_723_7153496419225102220-2/-ext-10001/.distcp.tmp.attempt_1522833620762_4416_m_00_0. Source and target differ in block-size. Use -pb to preserve block-sizes during copy. Alternatively, skip checksum-checks altogether, using -skipCrc. (NOTE: By skipping checksums, one runs the risk of masking data-corruption during file-transfer.) at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.compareCheckSums(RetriableFileCopyCommand.java:212) at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doCopy(RetriableFileCopyCommand.java:130) at org.apache.hadoop.tools.mapred.RetriableFileCopyCommand.doExecute(RetriableFileCopyCommand.java:99) at org.apache.hadoop.tools.util.RetriableCommand.execute(RetriableCommand.java:87) ... 11 more {code} Distcp failed because the CM path for the file doesn't point to the source file system. So, the fully qualified CM root URI needs to be recorded with the files listed in the dump. Also, REPL LOAD returns success even if the distcp jobs failed: CopyUtils.doCopyRetry doesn't throw an error if the copy failed even after the maximum attempts. So, three things need to be done. # If the copy of multiple files fails for some reason, retry with the same set of files, but switch to the CM path if the original source file is missing or modified (based on checksum). Let distcp skip the properly copied files; note that FileUtil.copy will always overwrite the files. # If the source path was moved to the CM path, delete the incorrectly copied files. # If the copy still fails after the maximum attempts, throw an error. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
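The retry scheme in the three steps above can be illustrated as follows. This is a minimal sketch, not Hive's actual CopyUtils implementation: CopyRetrySketch, FileCopier, sourceStillValid, and cmPathFor are hypothetical names standing in for the real file-system and CM helpers.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Illustrative sketch (hypothetical names): retry the failed files as a set,
// switching a missing/modified source file to its CM (change-management)
// path, and fail loudly once the attempts are exhausted.
class CopyRetrySketch {
    interface FileCopier { boolean copy(String src, String dst); }

    static void copyWithRetry(List<String> sources, String dstDir,
                              Predicate<String> sourceStillValid,
                              Function<String, String> cmPathFor,
                              FileCopier copier, int maxAttempts) {
        List<String> pending = new ArrayList<>(sources);
        for (int attempt = 0; attempt < maxAttempts && !pending.isEmpty(); attempt++) {
            List<String> failed = new ArrayList<>();
            for (String src : pending) {
                // Step 1: if the original file is gone or changed (checksum
                // mismatch), fall back to the CM copy kept for replication.
                String effectiveSrc = sourceStillValid.test(src) ? src : cmPathFor.apply(src);
                if (!copier.copy(effectiveSrc, dstDir)) {
                    failed.add(src);
                }
            }
            pending = failed;
        }
        if (!pending.isEmpty()) {
            // Step 3: unlike the buggy behaviour described above, surface the failure.
            throw new IllegalStateException("Copy failed after " + maxAttempts
                    + " attempts: " + pending);
        }
    }
}
```

The key behavioural change is the final throw: a copy that still fails after the maximum attempts must abort REPL LOAD instead of reporting success.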
[jira] [Created] (HIVE-19248) Hive replication causes file copy failures if HDFS block size differs across clusters
Sankar Hariappan created HIVE-19248: --- Summary: Hive replication causes file copy failures if HDFS block size differs across clusters Key: HIVE-19248 URL: https://issues.apache.org/jira/browse/HIVE-19248 Project: Hive Issue Type: Bug Components: HiveServer2, repl Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.1.0 This is the case where the events were deleted on the source because of old event purging, and hence min(source event id) > target event id (last replicated event id). Repl dump should fail in this case so that the user can drop the database and bootstrap again. The cleaner thread concurrently removes expired events from the NOTIFICATION_LOG table, so it is necessary to check whether the current dump missed any event while dumping. After fetching events in batches, we shall check whether the event ids form a contiguous sequence. If they are not contiguous, some events were likely missed in the dump, and an error should be thrown. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
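The contiguity check described above can be sketched as follows; EventGapCheck and isContiguous are hypothetical names for illustration, not Hive's actual API.

```java
// Sketch of the gap check: after fetching a batch of notification events
// (ordered by event id), verify the ids form a contiguous sequence. A hole
// means the cleaner thread purged an event this dump needed, so the dump
// should fail and the user should bootstrap again.
class EventGapCheck {
    static boolean isContiguous(long[] eventIds) {
        for (int i = 1; i < eventIds.length; i++) {
            if (eventIds[i] != eventIds[i - 1] + 1) {
                return false; // an event in between was purged
            }
        }
        return true;
    }
}
```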
[jira] [Created] (HIVE-19219) Hive replicated database is out of sync if events are cleaned-up.
Sankar Hariappan created HIVE-19219: --- Summary: Hive replicated database is out of sync if events are cleaned-up. Key: HIVE-19219 URL: https://issues.apache.org/jira/browse/HIVE-19219 Project: Hive Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.1.0 This is the case where the events were deleted on the source because of old event purging, and hence min(source event id) > target event id (last replicated event id). Repl dump should fail in this case so that the user can drop the database and bootstrap again. The next incremental repl dump could check whether the last completed event (passed as the fromEventId argument) is still present in the source NOTIFICATION_LOG table. If it is not present, it should error out saying that events are missing. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19130) NPE thrown when applying a drop partition event on target.
Sankar Hariappan created HIVE-19130: --- Summary: NPE thrown when applying a drop partition event on target. Key: HIVE-19130 URL: https://issues.apache.org/jira/browse/HIVE-19130 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 During incremental replication, if the events batch is split as follows, then the REPL LOAD of the second batch throws an NPE. Batch-1: CREATE_TABLE(t1) -> ADD_PARTITION(t1.p1) -> DROP_PARTITION (t1.p1) Batch-2: DROP_TABLE(t1) -> CREATE_TABLE(t1) -> ADD_PARTITION(t1.p1) -> DROP_PARTITION (t1.p1) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19007) REPL LOAD should set the Hive configs obtained through the WITH clause into the tasks created.
Sankar Hariappan created HIVE-19007: --- Summary: REPL LOAD should set the Hive configs obtained through the WITH clause into the tasks created. Key: HIVE-19007 URL: https://issues.apache.org/jira/browse/HIVE-19007 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 The configs received from the WITH clause of REPL LOAD are not properly propagated (after the changes in HIVE-18716) to the tasks created. It is also required to re-get the Hive db object if the configs are changed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18988) Support bootstrap replication of ACID tables
Sankar Hariappan created HIVE-18988: --- Summary: Support bootstrap replication of ACID tables Key: HIVE-18988 URL: https://issues.apache.org/jira/browse/HIVE-18988 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 Bootstrapping of ACID tables needs special handling to replicate a stable state of data. - If the ACID feature is enabled, perform the bootstrap dump for ACID tables within a read txn. -> Dump table/partition metadata. -> Get the list of valid data files for a table using the same logic as a read txn does. -> Dump the latest valid table write ID as per the current read txn. - Find the valid last replication state such that it points to the event ID of the open_txn event of the oldest on-going txn. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18923) ValidWriteIdList snapshot per table can be cached for multi-statement transactions.
Sankar Hariappan created HIVE-18923: --- Summary: ValidWriteIdList snapshot per table can be cached for multi-statement transactions. Key: HIVE-18923 URL: https://issues.apache.org/jira/browse/HIVE-18923 Project: Hive Issue Type: Sub-task Components: Transactions Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 Currently, each query within a multi-statement transaction requests the metastore/TxnHandler to build a ValidWriteIdList snapshot for the given table. But the snapshot won't change for the duration of the transaction, so it makes sense to cache it within QueryTxnManager. However, each txn should be able to view its own written rows. So, when a transaction allocates a writeId to write on a table, the cached ValidWriteIdList on this table should be recalculated as follows. Original ValidWriteIdList: {hwm=10, open/aborted=5,6} – (10 is allocated by txn < current txn_id). Allocated writeId for this txn: 13 – (11 and 12 are taken by some other txn > current txn_id). New ValidWriteIdList: {hwm=12, open/aborted=5,6,11,12} – (11 and 12 are added to the invalid list, so the visible set of the snapshot remains the same). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
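The recalculation in the example above can be sketched as follows. This mirrors only the arithmetic of the ticket's example; CachedWriteIdSnapshot and onWriteIdAllocated are hypothetical names, not Hive's actual classes.

```java
import java.util.TreeSet;

// Sketch of the cached-snapshot adjustment: when the current txn is
// allocated a writeId, the ids handed out to other transactions in between
// are added to the invalid list, so the visible set stays unchanged.
class CachedWriteIdSnapshot {
    long highWatermark;
    final TreeSet<Long> invalid = new TreeSet<>();

    CachedWriteIdSnapshot(long hwm, long... invalidIds) {
        this.highWatermark = hwm;
        for (long id : invalidIds) invalid.add(id);
    }

    // Called when the current txn allocates `writeId` on this table.
    void onWriteIdAllocated(long writeId) {
        for (long id = highWatermark + 1; id < writeId; id++) {
            invalid.add(id); // taken by some other, still-uncommitted txn
        }
        highWatermark = writeId - 1; // per the ticket's example: hwm=12 after allocating 13
    }
}
```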
[jira] [Created] (HIVE-18864) WriteId high water mark (HWM) is incorrect if ValidWriteIdList is obtained after allocating writeId by current transaction.
Sankar Hariappan created HIVE-18864: --- Summary: WriteId high water mark (HWM) is incorrect if ValidWriteIdList is obtained after allocating writeId by current transaction. Key: HIVE-18864 URL: https://issues.apache.org/jira/browse/HIVE-18864 Project: Hive Issue Type: Sub-task Components: Transactions Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 For multi-statement txns, it is possible that a write on a table happens after a read. Let's see the below scenario. # Committed txn=9 writes on table T1 with writeId=5. # Open txn=10. ValidTxnList(open:null, txn_HWM=10). # Read table T1 from txn=10. ValidWriteIdList(open:null, write_HWM=5). # Open txn=11, which writes on table T1 with writeId=6. # Read table T1 from txn=10. ValidWriteIdList(open:null, write_HWM=5). # Write table T1 from txn=10 with writeId=7. # Read table T1 from txn=10. *ValidWriteIdList(open:null, write_HWM=7)* – this read will be able to see the rows added by txn=11, which is still open. So, the open/aborted list of ValidWriteIdList needs to be rebuilt based on txn_HWM: any writeId allocated by a txnId > txn_HWM should be marked as open. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
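The proposed fix — marking as open any writeId allocated by a transaction above txn_HWM — can be sketched as follows; WriteIdHwmFix is a hypothetical name, and the txnId-to-writeId map stands in for the TXN_TO_WRITE_ID entries.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Sketch: given the txn-level high water mark and the txnId -> writeId
// allocations recorded for a table, every writeId allocated by a txn above
// txn_HWM must be treated as open/invalid, even if it is below write_HWM.
class WriteIdHwmFix {
    static Set<Long> openWriteIds(long txnHwm, Map<Long, Long> txnToWriteId) {
        Set<Long> open = new HashSet<>();
        for (Map.Entry<Long, Long> e : txnToWriteId.entrySet()) {
            if (e.getKey() > txnHwm) {
                open.add(e.getValue()); // writer started after our snapshot
            }
        }
        return open;
    }
}
```

In the scenario above, with txn_HWM=10 and txn=11 holding writeId=6, writeId 6 lands in the open list, so the read from txn=10 no longer sees txn=11's rows.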
[jira] [Created] (HIVE-18824) ValidWriteIdList config should be defined on tables which have to collect stats after insert.
Sankar Hariappan created HIVE-18824: --- Summary: ValidWriteIdList config should be defined on tables which have to collect stats after insert. Key: HIVE-18824 URL: https://issues.apache.org/jira/browse/HIVE-18824 Project: Hive Issue Type: Sub-task Components: HiveServer2, Transactions Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 In HIVE-18192, per table write IDs were introduced, where snapshot isolation is built using a ValidWriteIdList on tables which are read within a txn. The ReadEntity list is used to decide which tables are read within a txn. For an insert operation, the table will be found only in the WriteEntity, but the table is still read to collect stats. So, the ValidWriteIdList needs to be built for the tables/partitions that are part of the WriteEntity as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18753) Correct method and variable names to use writeId instead of transactionId.
Sankar Hariappan created HIVE-18753: --- Summary: Correct method and variable names to use writeId instead of transactionId. Key: HIVE-18753 URL: https://issues.apache.org/jira/browse/HIVE-18753 Project: Hive Issue Type: Sub-task Components: HiveServer2, Transactions Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 The per table write ID implementation (HIVE-18192) has replaced the global transaction ID with a per table write ID to version the data files in a table. However, the ACID data files referred to by various class methods and variables still use the name transactionId to mean write ID. So, those methods/variables/classes need to be renamed to mean writeId instead of transactionId. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18752) HiveEndPoint: Optimise opening batch transactions and getting write Ids for each transaction in the batch into single metastore api.
Sankar Hariappan created HIVE-18752: --- Summary: HiveEndPoint: Optimise opening batch transactions and getting write Ids for each transaction in the batch into single metastore api. Key: HIVE-18752 URL: https://issues.apache.org/jira/browse/HIVE-18752 Project: Hive Issue Type: Sub-task Components: HiveServer2, Metastore Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 The per table write ID implementation (HIVE-18192) has introduced write IDs mapped against txns. Now, for streaming ingest, we need to open a batch of txns and then allocate a write ID for each txn in the batch, which takes 2 metastore calls. This can be optimised to use only one metastore API call. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18751) get_splits UDF on ACID table scan doesn't receive ValidWriteIdList configuration.
Sankar Hariappan created HIVE-18751: --- Summary: get_splits UDF on ACID table scan doesn't receive ValidWriteIdList configuration. Key: HIVE-18751 URL: https://issues.apache.org/jira/browse/HIVE-18751 Project: Hive Issue Type: Sub-task Components: Transactions Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 Per table write IDs (HIVE-18192) have replaced the global transaction ID with a write ID to version data files in ACID/MM tables. To ensure snapshot isolation, a ValidWriteIdList needs to be generated for the given txn/table and used when scanning ACID/MM tables. The get_splits UDF, which runs on an ACID table scan query, doesn't receive it properly through the configuration and hence throws an exception. TestAcidOnTez.testGetSplitsLocks is the test failing for the same. Need to fix it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18750) Exchange partition should not be supported with per table write ID.
Sankar Hariappan created HIVE-18750: --- Summary: Exchange partition should not be supported with per table write ID. Key: HIVE-18750 URL: https://issues.apache.org/jira/browse/HIVE-18750 Project: Hive Issue Type: Sub-task Components: HiveServer2, Transactions Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 The per table write ID implementation (HIVE-18192) has introduced a write ID per table, using the write ID to name the delta/base files and also as the primary key for each row. Now, exchange partition would have to move delta/base files across tables without changing the write ID, which causes incorrect results. Also, the exchange partition feature exists to support the use-case of atomic updates; with ACID we shall support atomic updates anyway, and hence it makes sense to not support exchange partition for ACID and MM tables. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18749) Need to replace transactionId with per table writeId in RecordIdentifier.Field.transactionId
Sankar Hariappan created HIVE-18749: --- Summary: Need to replace transactionId with per table writeId in RecordIdentifier.Field.transactionId Key: HIVE-18749 URL: https://issues.apache.org/jira/browse/HIVE-18749 Project: Hive Issue Type: Sub-task Components: Transactions Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 The per table write ID implementation (HIVE-18192) has replaced the global transaction ID with a write ID as the primary key for a row, marked by RecordIdentifier.Field.transactionId. The same needs to be replaced with writeId, and all test result files updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18748) Rename tables should update the table names in NEXT_WRITE_ID and TXN_TO_WRITE_ID tables.
Sankar Hariappan created HIVE-18748: --- Summary: Rename tables should update the table names in NEXT_WRITE_ID and TXN_TO_WRITE_ID tables. Key: HIVE-18748 URL: https://issues.apache.org/jira/browse/HIVE-18748 Project: Hive Issue Type: Sub-task Components: HiveServer2, Transactions Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 The per table write ID implementation (HIVE-18192) introduces a couple of meta tables, NEXT_WRITE_ID and TXN_TO_WRITE_ID, to manage the write IDs allocated per table. Now, when we rename a table, it is necessary to update the corresponding table name in these tables as well. Otherwise, ACID table operations won't work properly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18747) Cleaner for TXNS_TO_WRITE_ID table entries.
Sankar Hariappan created HIVE-18747: --- Summary: Cleaner for TXNS_TO_WRITE_ID table entries. Key: HIVE-18747 URL: https://issues.apache.org/jira/browse/HIVE-18747 Project: Hive Issue Type: Sub-task Components: repl, Transactions Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 The per table write ID implementation (HIVE-18192) maintains a map between txn IDs and table write IDs in the TXN_TO_WRITE_ID meta table. The entries in this table are used to generate the ValidWriteIdList for a given ValidTxnList to ensure snapshot isolation. When a table or database is dropped, these entries are cleaned up. But it is necessary to clean up entries for active tables too, for better performance. Another table, MIN_HISTORY_LEVEL, is needed to maintain the least txn which is referred to by any active ValidTxnList snapshot as open/aborted. If no references to a txn are found in this table, then it is eligible for cleanup. After clean-up, just one entry per table needs to be maintained to mark the LWM (low water mark). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18320) Support ACID Tables replication
Sankar Hariappan created HIVE-18320: --- Summary: Support ACID Tables replication Key: HIVE-18320 URL: https://issues.apache.org/jira/browse/HIVE-18320 Project: Hive Issue Type: New Feature Components: HiveServer2, Metastore, repl, Transactions Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 Currently, Full ACID and MM (Micro-Managed) tables are not supported by Replv2. Need to support it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-18031) Support replication for Alter Database operation.
Sankar Hariappan created HIVE-18031: --- Summary: Support replication for Alter Database operation. Key: HIVE-18031 URL: https://issues.apache.org/jira/browse/HIVE-18031 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 Currently, alter database operations that alter the database properties or description do not generate any events, so these changes do not get replicated. Need to add an event for this. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17887) Incremental REPL LOAD with Drop partition event on timestamp type partition column fails.
Sankar Hariappan created HIVE-17887: --- Summary: Incremental REPL LOAD with Drop partition event on timestamp type partition column fails. Key: HIVE-17887 URL: https://issues.apache.org/jira/browse/HIVE-17887 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 Replicating a drop partition event on a table partitioned by a timestamp type column fails in REPL LOAD. *Scenario:* 1. create a table partitioned on a timestamp column. 2. bootstrap dump/load. 3. insert a record to create partition(p="2001-11-09 00:00:00.0"). 4. drop the same partition(p="2001-11-09 00:00:00.0"). 5. incremental dump/load -- REPL LOAD throws the below exception {quote}2017-10-23 12:26:14,050 ERROR [HiveServer2-Background-Pool: Thread-36769]: metastore.RetryingHMSHandler (RetryingHMSHandler.java:invokeInternal(203)) - MetaException(message:Error parsing partition filter; lexer error: line 1:18 no viable alternative at character ':'; exception MismatchedTokenException(12!=23)) at org.apache.hadoop.hive.metastore.ObjectStore.getFilterParser(ObjectStore.java:2759) at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:2708) at org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:2517) at sun.reflect.GeneratedMethodAccessor362.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:103) at com.sun.proxy.$Proxy18.getPartitionsByFilter(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:4957) at sun.reflect.GeneratedMethodAccessor361.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at 
java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105) at com.sun.proxy.$Proxy21.get_partitions_by_filter(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitionsByFilter(HiveMetaStoreClient.java:1200) at sun.reflect.GeneratedMethodAccessor360.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:178) at com.sun.proxy.$Proxy22.listPartitionsByFilter(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.getPartitionsByFilter(Hive.java:2562) at org.apache.hadoop.hive.ql.exec.DDLTask.dropPartitions(DDLTask.java:4018) at org.apache.hadoop.hive.ql.exec.DDLTask.dropTableOrPartitions(DDLTask.java:3993) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:343) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:162) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:89) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1751) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1497) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1294) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1156) at org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:197) at org.apache.hive.service.cli.operation.SQLOperation.access$300(SQLOperation.java:76) at org.apache.hive.service.cli.operation.SQLOperation$2$1.run(SQLOperation.java:255) {quote} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17757) REPL LOAD should use customised configurations to execute distcp/remote copy.
Sankar Hariappan created HIVE-17757: --- Summary: REPL LOAD should use customised configurations to execute distcp/remote copy. Key: HIVE-17757 URL: https://issues.apache.org/jira/browse/HIVE-17757 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 As the REPL LOAD command needs to read the repl dump directory and data files from the source cluster, it needs to use some configurations to read data securely through distcp. Some of the HDFS configurations cannot be added to the whitelist as they pose a security threat. So, it is necessary for the REPL LOAD command to take such configs as input and use them when triggering distcp. *Proposed syntax:* REPL LOAD [.] FROM [*WITH (key1=value1, key2=value2)*]; -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17728) TestHCatClient should configure hive.metastore.transactional.event.listeners as per recommendation.
Sankar Hariappan created HIVE-17728: --- Summary: TestHCatClient should configure hive.metastore.transactional.event.listeners as per recommendation. Key: HIVE-17728 URL: https://issues.apache.org/jira/browse/HIVE-17728 Project: Hive Issue Type: Sub-task Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 Currently, TestHCatClient.java uses hive.metastore.event.listeners to enable notification events logging. But the recommended configuration for the same is hive.metastore.transactional.event.listeners. So, need to update the same. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17681) Need to log "Bootstrap dump progress state" property to HS2 logs.
Sankar Hariappan created HIVE-17681: --- Summary: Need to log "Bootstrap dump progress state" property to HS2 logs. Key: HIVE-17681 URL: https://issues.apache.org/jira/browse/HIVE-17681 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 Currently, rename is disabled while a bootstrap dump is in progress. This is achieved by setting an "ACTIVE" flag in the database properties, which is checked by the rename operation. If HiveServer2 crashes while a dump is in progress, this flag is never unset (to the IDLE state), which leaves rename disabled forever; the user then needs to manually enable rename. So, the property name associated with the in-progress bootstrap dump needs to be logged, so that after an HS2 crash the user can reset this property on the database to enable rename again. The documentation also needs to be updated for the same. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17608) REPL LOAD should overwrite the data files if they exist instead of duplicating them
Sankar Hariappan created HIVE-17608: --- Summary: REPL LOAD should overwrite the data files if they exist instead of duplicating them Key: HIVE-17608 URL: https://issues.apache.org/jira/browse/HIVE-17608 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 This is to make the insert event idempotent. Currently, MoveTask creates a new file if the destination folder contains a file of the same name. This is wrong: if we have the same file in both the bootstrap dump and an incremental dump (by design, a duplicate file in an incremental dump is ignored for idempotence), we eventually end up with duplicate files. It is also wrong to just retain the filename from the staging folder. Suppose we get the same insert event twice: the first time we get the file from the source table folder, the second time from CM, and we still end up with a duplicate copy. The right solution is to keep the same file name as in the source table folder. To do that, we can put the original filename in MoveWork and, in MoveTask, if the original filename is set, not generate a new name but simply overwrite. This needs to be done in both bootstrap and incremental load. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17527) Support replication for rename/move table across database
Sankar Hariappan created HIVE-17527: --- Summary: Support replication for rename/move table across database Key: HIVE-17527 URL: https://issues.apache.org/jira/browse/HIVE-17527 Project: Hive Issue Type: Sub-task Affects Versions: 2.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 Rename/move of a table across databases should be supported for replication. The scenario is as follows. 1. Create 2 databases (db1 and db2) in the source cluster. 2. Create the table db1.tbl1. 3. Run bootstrap replication for db1 and db2 to the target cluster. 4. Rename db1.tbl1 to db2.tbl1 in the source. 5. Run incremental replication for both db1 and db2. - The db1 dump fails, reporting that rename across databases is not supported. - The db2 dump misses the table, as no event is generated when it is moved to db2. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17494) Bootstrap REPL DUMP throws exception if a partitioned table is dropped while reading partitions.
Sankar Hariappan created HIVE-17494: --- Summary: Bootstrap REPL DUMP throws exception if a partitioned table is dropped while reading partitions. Key: HIVE-17494 URL: https://issues.apache.org/jira/browse/HIVE-17494 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 2.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 When a table is dropped between fetching the table and fetching its partitions, bootstrap dump throws an exception. 1. Fetch table names. 2. Get the table. 3. Dump the table object. 4. Drop the table from another thread. 5. Fetch partitions (throws an exception from fireReadTablePreEvent). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17428) REPL LOAD of ALTER_PARTITION event doesn't create import tasks if the partition doesn't exist during analyze phase.
Sankar Hariappan created HIVE-17428: --- Summary: REPL LOAD of ALTER_PARTITION event doesn't create import tasks if the partition doesn't exist during analyze phase. Key: HIVE-17428 URL: https://issues.apache.org/jira/browse/HIVE-17428 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 2.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 If the incremental dump event sequence has ADD_PARTITION followed by ALTER_PARTITION, REPL LOAD doesn't create any task for the ALTER_PARTITION event, as the partition doesn't exist during the analyze phase. Due to this, REPL STATUS returns the wrong last repl ID. Scenario: 1. Create a DB. 2. Create a partitioned table. 3. Bootstrap dump and load. 4. Insert into the table, into a dynamically created partition. - This insert generates ADD_PARTITION and ALTER_PARTITION events. 5. Incremental dump and load. - The load is successful. - But the last repl ID set is incorrect, as the ALTER_PARTITION event was never applied. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17367) IMPORT should overwrite the table if the dump has same state as table.
Sankar Hariappan created HIVE-17367: --- Summary: IMPORT should overwrite the table if the dump has same state as table. Key: HIVE-17367 URL: https://issues.apache.org/jira/browse/HIVE-17367 Project: Hive Issue Type: Bug Components: HiveServer2, Import/Export, repl Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 Repl v1 creates a set of EXPORT/IMPORT commands to replicate modified data (as per events) across clusters. For instance, an insert generates 2 events such as ALTER_TABLE (ID: 10) and INSERT (ID: 11). Each event generates a set of EXPORT and IMPORT commands. The ALTER_TABLE event generates a metadata-only export/import; INSERT generates a metadata+data export/import. As Hive always dumps the latest copy of the table during export, it sets the latest notification event ID as the table's current state. So, in this example, the import of metadata by the ALTER_TABLE event sets the current state of the table to 11. Now, when we try to import the data dumped by the INSERT event, it is a no-op, as the table's current state (11) equals the dump state (11), which in turn means the data never gets replicated to the target cluster. So, it is necessary to allow overwrite of a table/partition if its current state equals the dump state. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17289) IMPORT should copy files without using doAs user.
Sankar Hariappan created HIVE-17289: --- Summary: IMPORT should copy files without using doAs user. Key: HIVE-17289 URL: https://issues.apache.org/jira/browse/HIVE-17289 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 3.0.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 Currently, IMPORT uses distcp to copy larger files/large numbers of files from the dump directory to the table staging directory. But this copy fails, as distcp is always run as the doAs user specified in hive.distcp.privileged.doAs, which is "hdfs" by default. The doAs user should not be used when running distcp from the IMPORT flow. Also, the default for hive.distcp.privileged.doAs needs to be set to "hive", as the "hdfs" super-user is never allowed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17212) Dynamic add partition by insert shouldn't generate INSERT event.
Sankar Hariappan created HIVE-17212: --- Summary: Dynamic add partition by insert shouldn't generate INSERT event. Key: HIVE-17212 URL: https://issues.apache.org/jira/browse/HIVE-17212 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 2.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 A partition is dynamically added if INSERT INTO is invoked on a non-existing partition. In this case, Hive should generate only ADD_PARTITION events with the new files added; it shouldn't create an INSERT event. Need to test and verify this behaviour. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17196) CM: ReplCopyTask should retain the original file names even if copied from CM path.
Sankar Hariappan created HIVE-17196: --- Summary: CM: ReplCopyTask should retain the original file names even if copied from CM path. Key: HIVE-17196 URL: https://issues.apache.org/jira/browse/HIVE-17196 Project: Hive Issue Type: Sub-task Components: repl Affects Versions: 2.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 Consider the below scenario. 1. Insert into table T1 with value(X). 2. Insert into table T1 with value(X). 3. Truncate the table T1. – This step backs up 2 files with the same content to cmroot, which ends up as one file in cmroot since the checksums match. 4. Incremental repl with the above 3 operations. – In this step, both insert event files are read from cmroot, where the copy of one overwrites the other, as the file name is the same in the CM path (the checksum is used as the file name). This leads to data loss, and hence it is necessary to retain the original file names even when copying from the CM path. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
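The fix described above can be sketched as follows; CmCopyNaming and destinationName are hypothetical names illustrating the naming rule, not ReplCopyTask's actual API.

```java
import java.nio.file.Paths;

// Sketch: when the copy source is a CM path (whose file name is the content
// checksum), the destination must use the original file name carried in the
// replication metadata, so two inserts of identical content don't collide.
class CmCopyNaming {
    static String destinationName(String sourcePath, boolean fromCm, String originalName) {
        if (fromCm) {
            return originalName; // e.g. "000000_0", not the checksum
        }
        return Paths.get(sourcePath).getFileName().toString();
    }
}
```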
[jira] [Created] (HIVE-17195) Long chain of tasks created by REPL LOAD shouldn't cause stack corruption.
Sankar Hariappan created HIVE-17195: --- Summary: Long chain of tasks created by REPL LOAD shouldn't cause stack corruption. Key: HIVE-17195 URL: https://issues.apache.org/jira/browse/HIVE-17195 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 2.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 Currently, a long chain of REPL LOAD tasks leads to deeply recursive calls when traversing the DAG. For example, the getMRTasks, getTezTasks, getSparkTasks and iterateTasks methods run recursively to traverse the DAG. This traversal logic needs to be modified to reduce stack usage. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
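One way to reduce stack usage, sketched below, is to replace the recursive traversal with an explicit stack; IterativeDagWalk and Node are hypothetical stand-ins for Hive's Task DAG, not the actual classes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch: depth-first walk of a task DAG using an explicit stack, so an
// arbitrarily long task chain cannot overflow the call stack.
class IterativeDagWalk {
    static class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    static List<String> collect(Node root) {
        List<String> visited = new ArrayList<>();
        Set<Node> seen = new HashSet<>();
        Deque<Node> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (!seen.add(n)) continue; // a task can be reachable via several parents
            visited.add(n.name);
            for (Node c : n.children) stack.push(c);
        }
        return visited;
    }
}
```

A recursive walk of a chain of tens of thousands of tasks could overflow a default JVM thread stack; the iterative version handles it with constant call-stack depth.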
[jira] [Created] (HIVE-17183) Disable rename operations during bootstrap dump
Sankar Hariappan created HIVE-17183: --- Summary: Disable rename operations during bootstrap dump Key: HIVE-17183 URL: https://issues.apache.org/jira/browse/HIVE-17183 Project: Hive Issue Type: Sub-task Components: repl Affects Versions: 2.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 Currently, bootstrap dump can lead to data loss if any rename happens while the dump is in progress. Supporting renames during dump can come in the next phase of development, as it needs a proper design to keep track of renamed tables/partitions. So, for the time being, rename operations shall be disabled while a bootstrap dump is in progress, to avoid any inconsistent state.
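The proposed guard could look roughly like the sketch below; the class and method names are illustrative, not actual Hive code. Rename DDL would call the check and be rejected while a bootstrap dump holds the flag.

```java
// Illustrative sketch of the guard: reject rename DDL while a bootstrap dump
// is in progress, instead of risking a dump of a moved path. Not actual Hive
// code; names are hypothetical.
public class RenameGuard {
    private static volatile boolean bootstrapDumpInProgress = false;

    // Set by REPL DUMP at bootstrap start/end.
    public static void setDumpInProgress(boolean inProgress) {
        bootstrapDumpInProgress = inProgress;
    }

    // Called from rename table/partition DDL before any directory move.
    public static void checkRenameAllowed() {
        if (bootstrapDumpInProgress) {
            throw new IllegalStateException(
                "Rename is disabled while a bootstrap dump is in progress");
        }
    }
}
```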
[jira] [Created] (HIVE-17100) Improve HS2 operation logs for REPL commands.
Sankar Hariappan created HIVE-17100: --- Summary: Improve HS2 operation logs for REPL commands. Key: HIVE-17100 URL: https://issues.apache.org/jira/browse/HIVE-17100 Project: Hive Issue Type: Sub-task Components: HiveServer2, repl Affects Versions: 2.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 It is necessary to log the progress of the replication tasks in a structured manner, as follows.
Bootstrap Dump:
At the start of bootstrap dump, add one log with the below details.
* Database Name
* Dump Type (BOOTSTRAP)
* (Estimated) Total number of tables/views to dump
* (Estimated) Total number of functions to dump
* Dump Start Time
After each table dump, add a log as follows.
* Table/View Name
* Type (TABLE/VIEW/MATERIALIZED_VIEW)
* Table dump end time
* Table dump progress. Format is Table sequence no/(Estimated) Total number of tables and views.
After each function dump, add a log as follows.
* Function Name
* Function dump end time
* Function dump progress. Format is Function sequence no/(Estimated) Total number of functions.
After completion of all dumps, add a log as follows to consolidate the dump.
* Database Name
* Dump Type (BOOTSTRAP)
* Dump End Time
* (Actual) Total number of tables/views dumped
* (Actual) Total number of functions dumped
* Dump Directory
* Last Repl ID of the dump
Note: The actual and estimated number of tables/functions may not match if any table/function is dropped while the dump is in progress.
Bootstrap Load:
At the start of bootstrap load, add one log with the below details.
* Database Name
* Dump directory
* Load Type (BOOTSTRAP)
* Total number of tables/views to load
* Total number of functions to load
* Load Start Time
After each table load, add a log as follows.
* Table/View Name
* Type (TABLE/VIEW/MATERIALIZED_VIEW)
* Table load completion time
* Table load progress. Format is Table sequence no/Total number of tables and views.
After each function load, add a log as follows.
* Function Name
* Function load completion time
* Function load progress. Format is Function sequence no/Total number of functions.
After completion of all loads, add a log as follows to consolidate the load.
* Database Name
* Load Type (BOOTSTRAP)
* Load End Time
* Total number of tables/views loaded
* Total number of functions loaded
* Last Repl ID of the loaded database
Incremental Dump:
At the start of database dump, add one log with the below details.
* Database Name
* Dump Type (INCREMENTAL)
* (Estimated) Total number of events to dump
* Dump Start Time
After each event dump, add a log as follows.
* Event ID
* Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc.)
* Event dump end time
* Event dump progress. Format is Event sequence no/(Estimated) Total number of events.
After completion of all event dumps, add a log as follows.
* Database Name
* Dump Type (INCREMENTAL)
* Dump End Time
* (Actual) Total number of events dumped
* Dump Directory
* Last Repl ID of the dump
Note: The estimated number of events can be very inaccurate compared to the actual number, as we don't know the number of events upfront until we read from the metastore NotificationEvents table.
Incremental Load:
At the start of incremental load, add one log with the below details.
* Target Database Name
* Dump directory
* Load Type (INCREMENTAL)
* Total number of events to load
* Load Start Time
After each event load, add a log as follows.
* Event ID
* Event Type (CREATE_TABLE, DROP_TABLE, ALTER_TABLE, INSERT etc.)
* Target Table/View/Function Name
* Target Partition Name (for partition operations such as ADD_PARTITION, DROP_PARTITION, ALTER_PARTITION etc.; for other operations, it will be "null")
* Event load end time
* Event load progress. Format is Event sequence no/Total number of events.
After completion of all event loads, add a log as follows to consolidate the load.
* Target Database Name
* Load Type (INCREMENTAL)
* Load End Time
* Total number of events loaded
* Last Repl ID of the loaded database
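As an illustration only, a bootstrap dump start event could be rendered as a single structured line. The field names follow the lists above, but the exact HS2 log format is not specified in this issue, so the JSON-ish shape and the method name here are assumptions.

```java
// Illustrative sketch of a structured log line for bootstrap dump start.
// The prefix and JSON field names are hypothetical, not the actual HS2
// operation-log format.
public class ReplLog {
    public static String dumpStart(String dbName, int estimatedNumTables,
                                   int estimatedNumFunctions, long startMillis) {
        return String.format(
            "REPL::START: {\"dbName\":\"%s\",\"dumpType\":\"BOOTSTRAP\","
            + "\"estimatedNumTables\":%d,\"estimatedNumFunctions\":%d,"
            + "\"dumpStartTime\":%d}",
            dbName, estimatedNumTables, estimatedNumFunctions, startMillis);
    }
}
```

A fixed, machine-parseable prefix plus key/value fields lets tooling track replication progress without scraping free-form log text.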
[jira] [Created] (HIVE-17021) Support replication of concatenate operation.
Sankar Hariappan created HIVE-17021: --- Summary: Support replication of concatenate operation. Key: HIVE-17021 URL: https://issues.apache.org/jira/browse/HIVE-17021 Project: Hive Issue Type: Sub-task Components: Hive, repl Affects Versions: 2.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 We need to handle cases like ALTER TABLE ... CONCATENATE that also change the files on disk, and potentially treat them similarly to INSERT OVERWRITE, as concatenation does something equivalent to a compaction. Note that a ConditionalTask might also be fired at the end of inserts in a Tez task (or another execution engine), if the appropriate HiveConf settings are set, to perform this operation automatically; these also need to be taken care of for replication.
[jira] [Created] (HIVE-16990) REPL LOAD should update last repl ID only after successful copy of data files.
Sankar Hariappan created HIVE-16990: --- Summary: REPL LOAD should update last repl ID only after successful copy of data files. Key: HIVE-16990 URL: https://issues.apache.org/jira/browse/HIVE-16990 Project: Hive Issue Type: Sub-task Components: Hive, repl Affects Versions: 2.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 REPL LOAD operations that include both metadata and data changes should follow the rule below.
1. Copy the metadata, excluding the last repl ID.
2. Copy the data files.
3. If steps 1 and 2 are successful, update the last repl ID of the object.
This rule allows failed events to be re-applied by REPL LOAD and ensures no data loss due to failures.
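The ordering rule can be sketched as follows; this is a minimal illustration with hypothetical names, not the actual REPL LOAD code. The key point is that the last repl ID is committed only after both copies succeed, so a failed event leaves the old ID in place and can safely be re-applied.

```java
// Illustrative sketch of the commit ordering. Step is a stand-in for the
// metadata/data copy work; names are hypothetical.
public class ReplLoadStep {
    public interface Step { void run() throws Exception; }

    // Returns the new last-repl-ID only if every preceding step succeeded;
    // on any failure the current ID is kept so the event can be retried.
    public static long apply(Step copyMetadata, Step copyData,
                             long eventId, long currentLastReplId) {
        try {
            copyMetadata.run(); // step 1: metadata, excluding the repl ID
            copyData.run();     // step 2: data files
            return eventId;     // step 3: commit the new last repl ID
        } catch (Exception e) {
            return currentLastReplId; // unchanged: event will be re-applied
        }
    }
}
```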
[jira] [Created] (HIVE-16901) Distcp optimization - One distcp per ReplCopyTask
Sankar Hariappan created HIVE-16901: --- Summary: Distcp optimization - One distcp per ReplCopyTask Key: HIVE-16901 URL: https://issues.apache.org/jira/browse/HIVE-16901 Project: Hive Issue Type: Sub-task Components: Hive, repl Affects Versions: 2.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Fix For: 3.0.0 Currently, if a ReplCopyTask is created to copy a list of files, distcp is invoked for each and every file. Instead, the list of source files should be passed to the distcp tool, which copies the files in parallel and hence gives a large performance gain. If the batch copy of the file list fails, traverse the destination directory to find which files are missing or have checksum mismatches, then trigger a copy of those files one by one.
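The fallback step (deciding which files to recopy after a failed batch) can be sketched as below. This is an illustration, not the actual ReplCopyTask logic; checksums are modeled as plain strings keyed by file name.

```java
import java.util.*;

// Illustrative sketch of the per-file fallback after a failed batch copy:
// compare source and destination checksums and retry only files that are
// missing or mismatched. Maps stand in for directory listings.
public class DistcpBatch {
    public static List<String> filesToRetry(Map<String, String> sourceChecksums,
                                            Map<String, String> destChecksums) {
        List<String> retry = new ArrayList<>();
        for (Map.Entry<String, String> e : sourceChecksums.entrySet()) {
            String destSum = destChecksums.get(e.getKey());
            if (destSum == null || !destSum.equals(e.getValue())) {
                retry.add(e.getKey()); // missing or checksum mismatch
            }
        }
        return retry;
    }
}
```

Only the returned subset is copied one by one, so a transient batch failure no longer forces a full recopy of the list.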
[jira] [Created] (HIVE-16813) Incremental REPL LOAD should load the events in the same sequence as it is dumped.
Sankar Hariappan created HIVE-16813: --- Summary: Incremental REPL LOAD should load the events in the same sequence as it is dumped. Key: HIVE-16813 URL: https://issues.apache.org/jira/browse/HIVE-16813 Project: Hive Issue Type: Sub-task Components: Hive, repl Affects Versions: 2.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Currently, incremental REPL DUMP uses $dumpdir/ to dump the metadata and data files corresponding to each event. The events are dumped in the same sequence in which they were generated. REPL LOAD then lists the directories inside $dumpdir using listStatus and sorts them using the compareTo algorithm of the FileStatus class, which sorts alphabetically without checking the length. Due to this, event-100 is processed before event-99, making the replica database unreliable. A customised compareTo algorithm is needed to sort the FileStatus entries.
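The sorting fix can be sketched as a length-aware comparison. This is an illustration on plain directory-name strings (assuming the names encode the numeric event ID), not a patch to FileStatus itself: shorter numeric names sort first, and equal-length names fall back to the usual lexicographic order, which is then numerically correct.

```java
import java.util.*;

// Sketch of a length-aware sort for event dump directories. Plain
// lexicographic comparison orders "100" before "99"; comparing length first
// restores numeric order. Directory names holding numeric event IDs are an
// assumption for illustration.
public class EventDirSorter {
    public static void sortByEventId(List<String> dirNames) {
        dirNames.sort(Comparator.<String>comparingInt(String::length)
                                .thenComparing(Comparator.naturalOrder()));
    }
}
```

The same two-step comparison (length, then name) is what a customised FileStatus compareTo would need to apply to the event directory component of the path.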
[jira] [Created] (HIVE-16785) Ensure replication actions are idempotent if any series of events are applied again.
Sankar Hariappan created HIVE-16785: --- Summary: Ensure replication actions are idempotent if any series of events are applied again. Key: HIVE-16785 URL: https://issues.apache.org/jira/browse/HIVE-16785 Project: Hive Issue Type: Sub-task Components: Hive, repl Affects Versions: 2.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Some of the events (ALTER, RENAME, TRUNCATE) are not idempotent, and hence lead to failure of REPL LOAD if applied twice or if applied to an object that is newer than the current event. For example, TRUNCATE applied to a table that is already dropped will fail instead of being a no-op. We also need to consider the scenario where the object is missing while applying an event. For example, a RENAME_TABLE event applied on a target where the old table is missing should determine whether the table must be recreated or the event treated as a no-op. This can be done by verifying the DB-level last repl ID against the current event ID.
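The last check mentioned above can be sketched as a single guard; the class and method names are illustrative, not actual Hive code. An event whose ID is at or below the database's last replicated ID has already been applied, so replaying it becomes a no-op instead of an error.

```java
// Illustrative sketch of the idempotency guard: compare the incoming event ID
// against the DB-level last repl ID before applying the event. Names are
// hypothetical, not actual Hive code.
public class IdempotentApply {
    public static boolean shouldApply(long eventId, long dbLastReplId) {
        // Events at or below the last replicated ID were already applied on
        // this replica; skipping them makes replay of an event series safe.
        return eventId > dbLastReplId;
    }
}
```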
[jira] [Created] (HIVE-16750) Support change management for rename table/partition.
Sankar Hariappan created HIVE-16750: --- Summary: Support change management for rename table/partition. Key: HIVE-16750 URL: https://issues.apache.org/jira/browse/HIVE-16750 Project: Hive Issue Type: Sub-task Components: Hive, repl Affects Versions: 2.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Currently, renaming a table/partition updates the data location by renaming the directory, which is equivalent to moving the files to a new path and deleting the old path. So this should trigger a move of the files into $CMROOT. Scenario:
1. Create a table (T1).
2. Insert a record.
3. Rename the table (T1 -> T2).
4. Repl Dump till the Insert.
5. Repl Load from the dump.
6. The target DB should have table T1 with the record.
[jira] [Created] (HIVE-16727) REPL DUMP for insert event shouldn't fail if the table is already dropped.
Sankar Hariappan created HIVE-16727: --- Summary: REPL DUMP for insert event shouldn't fail if the table is already dropped. Key: HIVE-16727 URL: https://issues.apache.org/jira/browse/HIVE-16727 Project: Hive Issue Type: Sub-task Components: Hive, repl Affects Versions: 2.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Currently, the insert event doesn't log the table object as part of the event notification message. During dump, the table object is obtained from the metastore, which can be null if the table has already been dropped, and hence REPL DUMP fails. Steps:
1. Bootstrap dump/load with a table.
2. Insert into the table.
3. Drop the table.
4. REPL DUMP (incremental).
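The missing null check can be sketched as below; this is an illustration with a hypothetical class, not the actual event dump handler. A null table object (table dropped after the event was logged) makes the event a no-op rather than failing the whole dump.

```java
// Illustrative sketch: when dumping an INSERT event, a null table object
// fetched from the metastore should skip the event instead of throwing.
// Object stands in for the real Table type; names are hypothetical.
public class InsertEventDump {
    // Returns true if the event was dumped, false if it was skipped because
    // the table no longer exists.
    public static boolean dumpInsertEvent(Object tableObject) {
        if (tableObject == null) {
            return false; // table already dropped: treat the event as a no-op
        }
        // ... serialize the table object and the inserted files here ...
        return true;
    }
}
```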
[jira] [Created] (HIVE-16706) Bootstrap REPL DUMP shouldn't fail when a partition is dropped/renamed after fetching the partitions in a batch.
Sankar Hariappan created HIVE-16706: --- Summary: Bootstrap REPL DUMP shouldn't fail when a partition is dropped/renamed after fetching the partitions in a batch. Key: HIVE-16706 URL: https://issues.apache.org/jira/browse/HIVE-16706 Project: Hive Issue Type: Sub-task Components: Hive, repl Affects Versions: 2.1.0 Reporter: Sankar Hariappan Assignee: Sankar Hariappan Currently, bootstrap REPL DUMP fetches the partitions in a batch and then iterates through them. If any partition is dropped/renamed during iteration, it leads to a failure/exception. In this case, the partition should be skipped from the dump; REPL DUMP should not fail, and the subsequent incremental dump should ensure a consistent state of the table.
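The skip-on-disappearance behaviour can be sketched as follows. This is an illustration with hypothetical names, not the actual bootstrap dump code; dumpOne stands in for the real per-partition dump call, and NoSuchElementException models the "partition vanished" failure.

```java
import java.util.*;
import java.util.function.*;

// Illustrative sketch: iterate a fetched batch of partitions and skip any
// that disappear between fetch and dump, instead of failing the whole
// bootstrap dump. Names are hypothetical.
public class PartitionBatchDump {
    // Returns the number of partitions actually dumped.
    public static int dumpBatch(List<String> partitions, Consumer<String> dumpOne) {
        int dumped = 0;
        for (String p : partitions) {
            try {
                dumpOne.accept(p);
                dumped++;
            } catch (NoSuchElementException e) {
                // Partition was dropped/renamed after the batch was fetched:
                // skip it; the next incremental dump reconciles table state.
            }
        }
        return dumped;
    }
}
```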