horizonzy opened a new pull request, #4081:
URL: https://github.com/apache/bookkeeper/pull/4081
### Motivation
Some tests take much longer than necessary, even though the test methods
themselves have already finished running. One example is
AuditorRollingRestartTest#testAuditingDuringRollingRestart().
I printed the stack trace and found that teardown is stuck at
executor.awaitTermination(30, TimeUnit.SECONDS) in Auditor.shutdown(). The
Auditor's executor thread is blocked in ReplicationEnableCb.await(), so the
executor cannot terminate and the test waits an additional 30 seconds.
```
"Time-limited test" #18 daemon prio=5 os_prio=31 cpu=762.48ms elapsed=69.85s
tid=0x000000011fb87200 nid=0x871b waiting on condition [0x0000000173b39000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
    - parking to wait for <0x00002000063a18d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.parkNanos([email protected]/LockSupport.java:252)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos([email protected]/AbstractQueuedSynchronizer.java:1672)
    at java.util.concurrent.ThreadPoolExecutor.awaitTermination([email protected]/ThreadPoolExecutor.java:1464)
    at java.util.concurrent.Executors$DelegatedExecutorService.awaitTermination([email protected]/Executors.java:743)
    at org.apache.bookkeeper.replication.Auditor.shutdown(Auditor.java:623)
    at org.apache.bookkeeper.replication.AuditorElector.shutdown(AuditorElector.java:246)
    at org.apache.bookkeeper.replication.AutoRecoveryMain.shutdown(AutoRecoveryMain.java:157)
    at org.apache.bookkeeper.replication.AutoRecoveryMain.shutdown(AutoRecoveryMain.java:143)
    at org.apache.bookkeeper.test.BookKeeperClusterTestCase$ServerTester.stopAutoRecovery(BookKeeperClusterTestCase.java:916)
    at org.apache.bookkeeper.test.BookKeeperClusterTestCase.stopReplicationService(BookKeeperClusterTestCase.java:766)
    at org.apache.bookkeeper.test.BookKeeperClusterTestCase.stopBKCluster(BookKeeperClusterTestCase.java:278)
    at org.apache.bookkeeper.test.BookKeeperClusterTestCase.tearDown(BookKeeperClusterTestCase.java:203)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0([email protected]/Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke([email protected]/NativeMethodAccessorImpl.java:77)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke([email protected]/DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke([email protected]/Method.java:568)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
    at org.junit.internal.runners.statements.RunAfters.invokeMethod(RunAfters.java:46)
    at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
    at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
    at java.util.concurrent.FutureTask.run([email protected]/FutureTask.java:264)
    at java.lang.Thread.run([email protected]/Thread.java:833)

"AuditorBookie-127.0.0.1:59740" #306 daemon prio=5 os_prio=31 cpu=0.24ms
elapsed=50.97s tid=0x000000011f1a1400 nid=0x13407 waiting on condition
[0x00000004e8a36000]
   java.lang.Thread.State: WAITING (parking)
    at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
    - parking to wait for <0x00002000050dcff0> (a java.util.concurrent.CountDownLatch$Sync)
    at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:211)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire([email protected]/AbstractQueuedSynchronizer.java:715)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly([email protected]/AbstractQueuedSynchronizer.java:1047)
    at java.util.concurrent.CountDownLatch.await([email protected]/CountDownLatch.java:230)
    at org.apache.bookkeeper.replication.ReplicationEnableCb.await(ReplicationEnableCb.java:56)
    at org.apache.bookkeeper.replication.AuditorBookieCheckTask.waitIfLedgerReplicationDisabled(AuditorBookieCheckTask.java:181)
    at org.apache.bookkeeper.replication.AuditorBookieCheckTask.auditBookies(AuditorBookieCheckTask.java:104)
    at org.apache.bookkeeper.replication.AuditorBookieCheckTask.startAudit(AuditorBookieCheckTask.java:86)
    at org.apache.bookkeeper.replication.AuditorBookieCheckTask.runTask(AuditorBookieCheckTask.java:64)
    at org.apache.bookkeeper.replication.AuditorTask.run(AuditorTask.java:72)
    at org.apache.bookkeeper.replication.AuditorBookieCheckTask.run(AuditorBookieCheckTask.java:40)
    at java.util.concurrent.Executors$RunnableAdapter.call([email protected]/Executors.java:539)
    at java.util.concurrent.FutureTask.runAndReset([email protected]/FutureTask.java:305)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run([email protected]/ScheduledThreadPoolExecutor.java:305)
    at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1136)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:635)
    at java.lang.Thread.run([email protected]/Thread.java:833)
```
Therefore, we can make the awaitTermination timeout configurable through a
system variable, so that shutdown does not wait the full 30 seconds when
running unit tests.
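
To illustrate the idea, here is a minimal sketch of reading the shutdown timeout from a JVM system property. It is not the code from this pull request: the class `ShutdownTimeoutSketch`, the property name `example.auditor.shutdownTimeoutSeconds`, and the helper methods are all hypothetical, and it assumes the "system variable" is a JVM system property. Only the 30-second default comes from the description above.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ShutdownTimeoutSketch {
    // Hypothetical property name, chosen for illustration only.
    static final String TIMEOUT_PROP = "example.auditor.shutdownTimeoutSeconds";
    // Matches the 30-second wait seen in Auditor.shutdown().
    static final long DEFAULT_TIMEOUT_SECONDS = 30;

    // Reads the timeout from the system property, falling back to the default.
    static long shutdownTimeoutSeconds() {
        return Long.getLong(TIMEOUT_PROP, DEFAULT_TIMEOUT_SECONDS);
    }

    // Stops accepting new tasks, then waits up to the configured timeout.
    // Returns true if the executor terminated within that time.
    static boolean shutdown(ExecutorService executor) throws InterruptedException {
        executor.shutdown();
        return executor.awaitTermination(shutdownTimeoutSeconds(), TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        // A test harness could set a short timeout before starting the cluster.
        System.setProperty(TIMEOUT_PROP, "1");

        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch neverReleased = new CountDownLatch(1);
        // Simulates a task that blocks on a latch, like the stuck Auditor thread.
        executor.submit(() -> {
            neverReleased.await();
            return null;
        });

        long start = System.nanoTime();
        boolean terminated = shutdown(executor);
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println("terminated=" + terminated + " elapsedMs=" + elapsedMs);

        // Interrupt the blocked task so the JVM can exit.
        executor.shutdownNow();
    }
}
```

With the property unset, the sketch keeps the existing 30-second behavior, so only callers that opt in (for example, a test run started with `-Dexample.auditor.shutdownTimeoutSeconds=1`) would see the shorter wait.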
--
This is an automated message from the Apache Git Service.