[
https://issues.apache.org/jira/browse/HIVE-21880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ashutosh Bapat updated HIVE-21880:
----------------------------------
Attachment: HIVE-21880.01.patch
Status: Patch Available (was: Open)
The code in getNextNotification() just checks whether the next event has the
expected event id. This check may fail when there are multiple events with the
same event id or when event ids are missing. When the test fails, it fails
because there multiple events with the same event id.
We use derby database as backing db for metastore. Derby doesn't lock the row
being selected with FOR UPDATE clause. addNotificationLog() and
addNotificationEvent(), both functions, rely on the this behaviour to generate
monotonically increasing sequential event ids. Since the row is not locked, we
could fetch the same event id multiple times and then increment it to the same
value multiple times. That can cause the event ids to progress in unreliable
manner. So for Derby we lock the NOTIFICATION_SEQUENCE table instead of using
FOR UPDATE.
Note: TxnHandler uses a different behaviour to simulate the effect of FOR
UPDATE on Derby; it uses a JVM wide mutex for that. TxnHandler is not available
always esp. when there are no ACID tables involved, so we need to move that
mutex out of TxnHandler to a place common to DbNotificationListener and
TxnHandler e.g. SQLGenerater and also have to take care of mutex's reentrant
behaviour. Furthermore such a mutex wouldn't work when there are metastores are
running in separate JVMs.
Since the test in Subject is flaky, I have added another test which reliably
reproduces this behaviour.
> Enable flaky test
> TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites.
> ---------------------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-21880
> URL: https://issues.apache.org/jira/browse/HIVE-21880
> Project: Hive
> Issue Type: Bug
> Components: repl
> Affects Versions: 4.0.0
> Reporter: Sankar Hariappan
> Assignee: Ashutosh Bapat
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-21880.01.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Need tp enable
> TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites
> which is disabled as it is flaky and randomly failing with below error.
> {code}
> Error Message
> Notification events are missing in the meta store.
> Stacktrace
> java.lang.IllegalStateException: Notification events are missing in the meta
> store.
> at
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getNextNotification(HiveMetaStoreClient.java:3246)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
> at com.sun.proxy.$Proxy58.getNextNotification(Unknown Source)
> at
> org.apache.hadoop.hive.ql.metadata.events.EventUtils$MSClientNotificationFetcher.getNextNotificationEvents(EventUtils.java:107)
> at
> org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.fetchNextBatch(EventUtils.java:159)
> at
> org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.hasNext(EventUtils.java:189)
> at
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:231)
> at
> org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:121)
> at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
> at
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
> at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2709)
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2361)
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2028)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1788)
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1782)
> at
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
> at
> org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
> at
> org.apache.hadoop.hive.ql.parse.WarehouseInstance.run(WarehouseInstance.java:227)
> at
> org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:282)
> at
> org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:265)
> at
> org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:289)
> at
> org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites(TestReplicationScenariosAcidTablesBootstrap.java:328)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
> at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
> at
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
> at
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
> at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
> at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
> at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
> at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
> at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
> at
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> at
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
> at
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> at
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> at
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> at
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> at
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
> at
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
> at
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
> at
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)
> {code}
> https://builds.apache.org/job/PreCommit-HIVE-Build/17591/testReport/org.apache.hadoop.hive.ql.parse/TestReplicationScenariosAcidTablesBootstrap/testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites/
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)