**BUG REPORT**

**What did you do?**

In our cluster, the Auditor runs the periodic check only once. After the first periodic check, the Auditor does not run it again, even when the interval expires. To run the periodic check again, we have to restart the Auditor bookie.

Auditor's thread dump: https://gist.github.com/hrsakai/d65e8e2cd511173232b1010a9bbdf126

It seems that the `AuditorBookie` thread is stopped, waiting on a `CountDownLatch` for some reason.

```
"AuditorBookie-XXXXX:3181" #40 daemon prio=5 os_prio=0 tid=0x00007f049c117830 nid=0x5da4 waiting on condition [0x00007f0477dfc000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000000e04e54f8> (a java.util.concurrent.CountDownLatch$Sync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
    at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
    at org.apache.bookkeeper.replication.Auditor.checkAllLedgers(Auditor.java:696)
    at org.apache.bookkeeper.replication.Auditor$5.run(Auditor.java:359)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
```

I saw many timed-out messages in the Auditor's log file.

```
23:14:56.769 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO o.a.b.proto.PerChannelBookieClient - Timed-out 2 operations to channel [id: 0xa17320e1, L:/AAAA:38234 - R:XXXX.co.jp/BBBB:3181] for BBBB:3181
23:14:56.921 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO o.a.b.proto.PerChannelBookieClient - Timed-out 48 operations to channel [id: 0x359c20bd, L:/AAAA:38222 - R:BBBB/BBBB:3181] for BBBB:3181
23:15:37.768 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO o.a.b.proto.PerChannelBookieClient - Timed-out 2 operations to channel [id: 0xa17320e1, L:/AAAA:38234 - R:XXXX.co.jp/BBBB:3181] for BBBB:3181
23:15:37.921 [BookKeeperClientScheduler-OrderedScheduler-0-0] INFO o.a.b.proto.PerChannelBookieClient - Timed-out 91 operations to channel [id: 0x359c20bd, L:/AAAA:38222 - R:BBBB/BBBB:3181] for BBBB:3181
・
・
```

**What did you expect to see?**

The Auditor runs the periodic check every time the interval expires.

**What did you see instead?**

The Auditor runs the periodic check only once.

**System configuration**

BookKeeper version: 4.7.0
Number of Bookies: 5
Ensemble size: 2
Write quorum size: 2
Ack quorum size: 2
Periodic check interval: 1 day

[ Full content available at: https://github.com/apache/bookkeeper/issues/1578 ]
This message was relayed via gitbox.apache.org.
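
The thread dump shows a periodic task (`FutureTask.runAndReset`) parked in `CountDownLatch.await()`. The sketch below is not BookKeeper code. It is a minimal, self-contained illustration of why that state would explain the symptom: a periodic task on a `ScheduledExecutorService` is not run again until its previous execution returns. The class name `PeriodicLatchSketch`, the method `countRuns`, and the timings are hypothetical, and the latch that is never counted down only stands in for callbacks that never arrive. Whether the timed-out operations are what leaves the latch uncounted in the Auditor is an assumption, not something the logs confirm.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PeriodicLatchSketch {

    /**
     * Schedules a periodic task that waits on a latch nobody counts down and
     * returns how many times the task started within observeMillis.
     * If awaitTimeoutMillis <= 0, the task waits without a timeout.
     */
    public static int countRuns(long awaitTimeoutMillis, long observeMillis)
            throws InterruptedException {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        try {
            executor.scheduleAtFixedRate(() -> {
                runs.incrementAndGet();
                // Hypothetical stand-in for completion callbacks that never arrive.
                CountDownLatch done = new CountDownLatch(1);
                try {
                    if (awaitTimeoutMillis <= 0) {
                        done.await(); // unbounded wait: this execution never returns
                    } else {
                        done.await(awaitTimeoutMillis, TimeUnit.MILLISECONDS);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, 0, 50, TimeUnit.MILLISECONDS);

            Thread.sleep(observeMillis);
            return runs.get();
        } finally {
            // Interrupts the parked worker so the JVM can exit.
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("runs with unbounded await: " + countRuns(0, 500));
        System.out.println("runs with bounded await:   " + countRuns(20, 500));
    }
}
```

With the unbounded wait the task starts once and never again, which matches "periodic check only once". With a bounded wait the task keeps being rescheduled. This only demonstrates the scheduling behaviour and is not a proposed patch for `Auditor.checkAllLedgers`.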
