[ 
https://issues.apache.org/jira/browse/HIVE-21880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Bapat updated HIVE-21880:
----------------------------------
    Attachment: HIVE-21880.01.patch
        Status: Patch Available  (was: Open)

The code in getNextNotification() only checks whether the next event has the 
expected event id. This check can fail when multiple events share the same 
event id or when event ids are missing. When this test fails, it fails 
because there are multiple events with the same event id.
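
To make the failure mode concrete, here is a minimal sketch of a contiguity 
check of that kind. The class and method names are hypothetical and this is 
not the actual HiveMetaStoreClient code; only the leading part of the error 
message is taken from the stack trace quoted below.

```java
import java.util.List;

// Hypothetical helper for illustration; not the real HiveMetaStoreClient code.
class EventIdCheck {
    /**
     * Throws unless ids is exactly lastSeenId + 1, lastSeenId + 2, ...
     * A duplicate id trips the same check as a gap, which is why duplicates
     * are reported as "missing" events.
     */
    static void requireContiguous(long lastSeenId, List<Long> ids) {
        long expected = lastSeenId + 1;
        for (long id : ids) {
            if (id != expected) {
                throw new IllegalStateException(
                    "Notification events are missing in the meta store."
                    + " Expected event id " + expected + " but found " + id);
            }
            expected++;
        }
    }
}
```

With ids 11, 12, 12, 13 after event 10, the second 12 is rejected because 13 
was expected, even though no event is actually missing.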

We use Derby as the backing database for the metastore. Derby doesn't lock the 
row being selected with a FOR UPDATE clause. Both addNotificationLog() and 
addNotificationEvent() rely on FOR UPDATE locking that row to generate 
monotonically increasing, sequential event ids. Since the row is not locked, 
we could fetch the same event id multiple times and then increment it to the 
same value multiple times. That can cause the event ids to progress 
unreliably. So for Derby we lock the NOTIFICATION_SEQUENCE table instead of 
using FOR UPDATE.
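
The race can be sketched with an in-memory stand-in for the sequence row. 
Everything here (SequenceRow, SequenceRaceDemo, the method names) is 
hypothetical and only models the read-increment-write pattern; it does not 
touch Derby or the metastore. On Derby the exclusive lock would presumably be 
taken with a LOCK TABLE ... IN EXCLUSIVE MODE statement, but the exact 
statement used by the patch is not shown in this message.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical in-memory stand-in for the single NOTIFICATION_SEQUENCE row.
class SequenceRow {
    private long nextEventId = 1;
    // Stands in for an exclusive lock on the whole table.
    private final ReentrantLock tableLock = new ReentrantLock();

    long read() { return nextEventId; }

    void write(long value) { nextEventId = value; }

    // Read, increment and write back while holding the lock, so no other
    // caller can observe the old value in between.
    long allocateLocked() {
        tableLock.lock();
        try {
            long id = read();
            write(id + 1);
            return id;
        } finally {
            tableLock.unlock();
        }
    }
}

class SequenceRaceDemo {
    // Replays the unlucky interleaving deterministically: both "transactions"
    // read the row before either one writes it back.
    static long[] unlockedInterleaving(SequenceRow row) {
        long idForA = row.read();
        long idForB = row.read();
        row.write(idForA + 1);
        row.write(idForB + 1);
        return new long[] {idForA, idForB};
    }
}
```

In the unlocked interleaving both callers end up with the same event id and 
the sequence advances only once; with allocateLocked() every caller gets a 
distinct id.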

Note: TxnHandler takes a different approach to simulate the effect of FOR 
UPDATE on Derby; it uses a JVM-wide mutex. TxnHandler is not always 
available, especially when no ACID tables are involved, so to reuse that 
approach we would need to move the mutex out of TxnHandler to a place common 
to DbNotificationListener and TxnHandler, e.g. SQLGenerator, and also take 
care of the mutex's reentrant behaviour. Furthermore, such a mutex wouldn't 
work when metastores are running in separate JVMs.
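
For comparison, a JVM-wide reentrant mutex of the kind described in the note 
could look roughly like the sketch below. DerbyMutex and its methods are 
hypothetical names, not the TxnHandler implementation; the sketch only 
illustrates the reentrancy concern, i.e. that a thread already holding the 
mutex can enter it again without deadlocking.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical shared holder. A static lock is visible to every caller in
// this JVM, but it cannot coordinate metastores running in other JVMs.
final class DerbyMutex {
    private static final ReentrantLock LOCK = new ReentrantLock();

    private DerbyMutex() {}

    // Runs body while holding the JVM-wide lock. Nested calls from the same
    // thread succeed because ReentrantLock counts holds per thread.
    static <T> T withLock(Supplier<T> body) {
        LOCK.lock();
        try {
            return body.get();
        } finally {
            LOCK.unlock();
        }
    }

    // Number of holds the current thread has on the lock (0 if none).
    static int holdCount() {
        return LOCK.getHoldCount();
    }
}
```

A nested call such as withLock(() -> withLock(...)) from one thread completes 
normally, and the hold count returns to zero once both calls exit.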

Since the test named in the subject is flaky, I have added another test that 
reliably reproduces this behaviour.

> Enable flaky test 
> TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites.
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-21880
>                 URL: https://issues.apache.org/jira/browse/HIVE-21880
>             Project: Hive
>          Issue Type: Bug
>          Components: repl
>    Affects Versions: 4.0.0
>            Reporter: Sankar Hariappan
>            Assignee: Ashutosh Bapat
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-21880.01.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Need to enable 
> TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites,
> which is disabled because it is flaky and randomly fails with the error below.
> {code}
> Error Message
> Notification events are missing in the meta store.
> Stacktrace
> java.lang.IllegalStateException: Notification events are missing in the meta store.
>       at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getNextNotification(HiveMetaStoreClient.java:3246)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:212)
>       at com.sun.proxy.$Proxy58.getNextNotification(Unknown Source)
>       at org.apache.hadoop.hive.ql.metadata.events.EventUtils$MSClientNotificationFetcher.getNextNotificationEvents(EventUtils.java:107)
>       at org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.fetchNextBatch(EventUtils.java:159)
>       at org.apache.hadoop.hive.ql.metadata.events.EventUtils$NotificationEventIterator.hasNext(EventUtils.java:189)
>       at org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.incrementalDump(ReplDumpTask.java:231)
>       at org.apache.hadoop.hive.ql.exec.repl.ReplDumpTask.execute(ReplDumpTask.java:121)
>       at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212)
>       at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:103)
>       at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2709)
>       at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2361)
>       at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2028)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1788)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1782)
>       at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:162)
>       at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:223)
>       at org.apache.hadoop.hive.ql.parse.WarehouseInstance.run(WarehouseInstance.java:227)
>       at org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:282)
>       at org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:265)
>       at org.apache.hadoop.hive.ql.parse.WarehouseInstance.dump(WarehouseInstance.java:289)
>       at org.apache.hadoop.hive.ql.parse.TestReplicationScenariosAcidTablesBootstrap.testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites(TestReplicationScenariosAcidTablesBootstrap.java:328)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>       at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>       at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>       at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>       at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>       at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>       at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
>       at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
>       at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
>       at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
>       at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
>       at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
>       at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
>       at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
>       at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>       at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>       at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
>       at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
>       at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
>       at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
>       at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
>       at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379)
>       at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340)
>       at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125)
>       at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413)
> {code}
> https://builds.apache.org/job/PreCommit-HIVE-Build/17591/testReport/org.apache.hadoop.hive.ql.parse/TestReplicationScenariosAcidTablesBootstrap/testBootstrapAcidTablesDuringIncrementalWithConcurrentWrites/



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
