horizonzy opened a new issue, #3954: URL: https://github.com/apache/bookkeeper/issues/3954
**BUG REPORT** In LedgerMetadataIndex, it has two properties pendingLedgersUpdates and pendingDeletedLedgers. https://github.com/apache/bookkeeper/blob/5beb8948baa0f287e597ff2a29ec672ac2cb1220/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/LedgerMetadataIndex.java#L63-L66 When adding elements or removing elements, we will use a lock to avoid race conditions. https://github.com/apache/bookkeeper/blob/5beb8948baa0f287e597ff2a29ec672ac2cb1220/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/LedgerMetadataIndex.java#L121-L157 But when iterating it, we didn't use a lock. https://github.com/apache/bookkeeper/blob/5beb8948baa0f287e597ff2a29ec672ac2cb1220/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/LedgerMetadataIndex.java#L297-L315 There is a case. At line_301, the pendingLedgersUpdates is not empty. Before line_302, the pendingLedgersUpdates elements be removed. Then line_302 poll a null element, line_303 throws NPE. Then the bookie shutdown. ``` 2023-05-12T16:58:26,205 - INFO - [db-storage-cleanup-5-1:EntryLocationIndex@220] - Deleting indexes for ledgers: [80, 98, 83, 101, 86, 104, 89, 107, 92, 95] 2023-05-12T16:58:26,212 - INFO - [db-storage-cleanup-5-1:EntryLocationIndex@244] - Deleted indexes from 10 ledgers in 0.006 seconds 2023-05-12T16:58:26,203 - ERROR - [SyncThread-7-1:SyncThread@111] - Exception in SyncThread java.lang.NullPointerException: null at org.apache.bookkeeper.bookie.storage.ldb.LedgerMetadataIndex.flush(LedgerMetadataIndex.java:303) ~[classes/:?] at org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage.checkpoint(SingleDirectoryDbLedgerStorage.java:836) ~[classes/:?] at org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage.checkpoint(DbLedgerStorage.java:392) ~[classes/:?] at org.apache.bookkeeper.bookie.SyncThread.checkpoint(SyncThread.java:166) ~[classes/:?] at org.apache.bookkeeper.bookie.SyncThread.lambda$doCheckpoint$1(SyncThread.java:108) ~[classes/:?] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_333] at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_333] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) ~[?:1.8.0_333] at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) ~[?:1.8.0_333] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_333] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_333] at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[netty-common-4.1.89.Final.jar:4.1.89.Final] at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_333] 2023-05-12T16:58:26,227 - ERROR - [SyncThread-7-1:BookieImpl$6@751] - Fatal error reported by ledgerDirsManager 2023-05-12T16:58:26,229 - INFO - [SyncThread-7-1:BookieImpl@828] - Triggering shutdown of Bookie-3183 with exitCode 5 2023-05-12T16:58:26,230 - INFO - [BookieShutdownTrigger:BookieImpl@852] - Shutting down Bookie-3183 with exitCode 5 2023-05-12T16:58:26,232 - INFO - [BookieShutdownTrigger:BookieImpl@861] - Turning bookie to read only during shut down 2023-05-12T16:58:26,232 - INFO - [BookieShutdownTrigger:SyncThread@217] - Shutting down SyncThread 2023-05-12T16:58:26,234 - WARN - [BookieWriteThreadPool-OrderedExecutor-0-0:WriteEntryProcessor@61] - BookieServer is running in readonly mode, so rejecting the request from the client! 2023-05-12T16:58:26,245 - INFO - [SyncThread-7-1:SyncThread@149] - Flush ledger storage at checkpoint CheckpointList{checkpoints=[LogMark: logFileId - 1683880106189 , logFileOffset - 88196608]}. 2023-05-12T16:58:26,247 - INFO - [BookieShutdownTrigger:Journal@1225] - Shutting down Journal 2023-05-12T16:58:26,249 - INFO - [ForceWriteThread:Journal$ForceWriteThread@525] - ForceWrite thread interrupted 2023-05-12T16:58:26,250 - INFO - [BookieJournal-3183:Journal@1198] - Journal exits when shutting down 2023-05-12T16:58:26,250 - INFO - [BookieJournal-3183:Journal@1210] - Journal exited loop! 2023-05-12T16:58:26,250 - INFO - [BookieShutdownTrigger:Journal@1235] - Finished Shutting down Journal thread 2023-05-12T16:58:26,277 - INFO - [BookieShutdownTrigger:GarbageCollectorThread@666] - Shutting down GarbageCollectorThread 2023-05-12T16:58:26,278 - INFO - [BookieShutdownTrigger:DefaultEntryLogger@1166] - Stopping EntryLogger 2023-05-12T16:58:26,286 - INFO - [BookieShutdownTrigger:EntryLoggerAllocator@239] - Stopped entry logger preallocator. 2023-05-12T16:58:26,296 - WARN - [BookieWriteThreadPool-OrderedExecutor-0-0:WriteEntryProcessor@61] - BookieServer is running in readonly mode, so rejecting the request from the client! 2023-05-12T16:58:26,299 - WARN - [BookieWriteThreadPool-OrderedExecutor-0-0:WriteEntryProcessor@61] - BookieServer is running in readonly mode, so rejecting the request from the client! 2023-05-12T16:58:26,304 - INFO - [BookieShutdownTrigger:LedgerDirsMonitor@224] - Shutting down LedgerDirsMonitor 2023-05-12T16:58:26,404 - INFO - [BookieDeathWatcher-3183:BookieServer$DeathWatcher@274] - BookieDeathWatcher noticed the bookie is not running any more, exiting the watch loop! 2023-05-12T16:58:26,405 - ERROR - [BookieDeathWatcher-3183:ComponentStarter@75] - Triggered exceptionHandler of Component: bookie-server because of Exception in Thread: Thread[BookieDeathWatcher-3183,5,main] java.lang.RuntimeException: Bookie is not running any more at org.apache.bookkeeper.proto.BookieServer$DeathWatcher.run(BookieServer.java:278) ~[classes/:?] 2023-05-12T16:58:26,408 - INFO - [component-shutdown-thread:ComponentStarter$ComponentShutdownHook@44] - Closing component bookie-server in shutdown hook. 2023-05-12T16:58:26,416 - INFO - [component-shutdown-thread:BookieServer@191] - Shutting down BookieServer 2023-05-12T16:58:26,416 - INFO - [component-shutdown-thread:BookieNettyServer@438] - Shutting down BookieNettyServer 2023-05-12T16:58:26,426 - INFO - [bookie-io-8-20:BookieRequestHandler@71] - Channels disconnected: [id: 0x82340670, L:/127.0.0.1:3183 ! R:/127.0.0.1:55944] 2023-05-12T16:58:26,426 - INFO - [bookie-io-8-9:BookieRequestHandler@71] - Channels disconnected: [id: 0x72bff9a6, L:/127.0.0.1:3183 ! R:/127.0.0.1:55905] 2023-05-12T16:58:26,426 - INFO - [bookie-io-8-3:BookieRequestHandler@71] - Channels disconnected: [id: 0x80de0186, L:/127.0.0.1:3183 ! R:/127.0.0.1:55947] 2023-05-12T16:58:26,426 - INFO - [bookie-io-8-7:BookieRequestHandler@71] - Channels disconnected: [id: 0x9667577a, L:/127.0.0.1:3183 ! R:/127.0.0.1:55951] 2023-05-12T16:58:26,426 - INFO - [bookie-io-8-16:BookieRequestHandler@71] - Channels disconnected: [id: 0x4977044e, L:/127.0.0.1:3183 ! R:/127.0.0.1:55912] 2023-05-12T16:58:26,427 - INFO - [bookie-io-8-6:BookieRequestHandler@71] - Channels disconnected: [id: 0x2df5b8dc, L:/127.0.0.1:3183 ! R:/127.0.0.1:55901] 2023-05-12T16:58:26,427 - INFO - [bookie-io-8-7:BookieRequestHandler@71] - Channels disconnected: [id: 0xbbfdf14c, L:/127.0.0.1:3183 ! R:/127.0.0.1:55903] 2023-05-12T16:58:26,427 - INFO - [bookie-io-8-9:BookieRequestHandler@71] - Channels disconnected: [id: 0x53232dca, L:/127.0.0.1:3183 ! R:/127.0.0.1:55953] 2023-05-12T16:58:26,427 - INFO - [bookie-io-8-11:BookieRequestHandler@71] - Channels disconnected: [id: 0x30c7f002, L:/127.0.0.1:3183 ! R:/127.0.0.1:55907] 2023-05-12T16:58:26,427 - INFO - [bookie-io-8-2:BookieRequestHandler@71] - Channels disconnected: [id: 0x449c23e6, L:/127.0.0.1:3183 ! R:/127.0.0.1:55946] 2023-05-12T16:58:26,427 - INFO - [bookie-io-8-3:BookieRequestHandler@71] - Channels disconnected: [id: 0xdb59e76c, L:/127.0.0.1:3183 ! R:/127.0.0.1:55895] 2023-05-12T16:58:26,427 - INFO - [bookie-io-8-11:BookieRequestHandler@71] - Channels disconnected: [id: 0xa6c03d86, L:/127.0.0.1:3183 ! R:/127.0.0.1:55955] 2023-05-12T16:58:26,427 - INFO - [bookie-io-8-6:BookieRequestHandler@71] - Channels disconnected: [id: 0x3ed94afa, L:/127.0.0.1:3183 ! R:/127.0.0.1:55950] 2023-05-12T16:58:26,427 - INFO - [bookie-io-8-10:BookieRequestHandler@71] - Channels disconnected: [id: 0x83c2bcfd, L:/127.0.0.1:3183 ! R:/127.0.0.1:55906] 2023-05-12T16:58:26,427 - INFO - [bookie-io-8-10:BookieRequestHandler@71] - Channels disconnected: [id: 0xf2cee6a9, L:/127.0.0.1:3183 ! R:/127.0.0.1:55954] 2023-05-12T16:58:26,427 - INFO - [bookie-io-8-8:BookieRequestHandler@71] - Channels disconnected: [id: 0x6030f9a7, L:/127.0.0.1:3183 ! R:/127.0.0.1:55952] 2023-05-12T16:58:26,427 - INFO - [bookie-io-8-8:BookieRequestHandler@71] - Channels disconnected: [id: 0x952d7506, L:/127.0.0.1:3183 ! R:/127.0.0.1:55904] 2023-05-12T16:58:26,427 - INFO - [bookie-io-8-4:BookieRequestHandler@71] - Channels disconnected: [id: 0xdf622342, L:/127.0.0.1:3183 ! R:/127.0.0.1:55948] 2023-05-12T16:58:26,428 - INFO - [bookie-io-8-4:BookieRequestHandler@71] - Channels disconnected: [id: 0xe6061e70, L:/127.0.0.1:3183 ! R:/127.0.0.1:55898] 2023-05-12T16:58:26,426 - INFO - [bookie-io-8-18:BookieRequestHandler@71] - Channels disconnected: [id: 0xda915641, L:/127.0.0.1:3183 ! R:/127.0.0.1:55914] 2023-05-12T16:58:26,432 - INFO - [bookie-io-8-13:BookieRequestHandler@71] - Channels disconnected: [id: 0xef6758a8, L:/127.0.0.1:3183 ! R:/127.0.0.1:55909] 2023-05-12T16:58:26,433 - INFO - [bookie-io-8-17:BookieRequestHandler@71] - Channels disconnected: [id: 0x48b99c12, L:/127.0.0.1:3183 ! R:/127.0.0.1:55913] 2023-05-12T16:58:26,433 - INFO - [bookie-io-8-15:BookieRequestHandler@71] - Channels disconnected: [id: 0x6812ca59, L:/127.0.0.1:3183 ! R:/127.0.0.1:55911] 2023-05-12T16:58:26,433 - INFO - [bookie-io-8-14:BookieRequestHandler@71] - Channels disconnected: [id: 0xeb6a9658, L:/127.0.0.1:3183 ! R:/127.0.0.1:55958] 2023-05-12T16:58:26,433 - INFO - [bookie-io-8-13:BookieRequestHandler@71] - Channels disconnected: [id: 0x9dff4ccd, L:/127.0.0.1:3183 ! R:/127.0.0.1:55957] 2023-05-12T16:58:26,433 - INFO - [bookie-io-8-14:BookieRequestHandler@71] - Channels disconnected: [id: 0xe2f781e7, L:/127.0.0.1:3183 ! R:/127.0.0.1:55910] 2023-05-12T16:58:26,433 - INFO - [bookie-io-8-1:BookieRequestHandler@71] - Channels disconnected: [id: 0x18bebbbe, L:/127.0.0.1:3183 ! R:/127.0.0.1:55945] 2023-05-12T16:58:26,434 - INFO - [bookie-io-8-5:BookieRequestHandler@71] - Channels disconnected: [id: 0x9d61caec, L:/127.0.0.1:3183 ! R:/127.0.0.1:55900] 2023-05-12T16:58:26,434 - INFO - [bookie-io-8-5:BookieRequestHandler@71] - Channels disconnected: [id: 0x41b4cc4a, L:/127.0.0.1:3183 ! R:/127.0.0.1:55949] 2023-05-12T16:58:26,434 - INFO - [bookie-io-8-12:BookieRequestHandler@71] - Channels disconnected: [id: 0x63ced3ef, L:/127.0.0.1:3183 ! R:/127.0.0.1:55956] 2023-05-12T16:58:26,434 - INFO - [bookie-io-8-12:BookieRequestHandler@71] - Channels disconnected: [id: 0xc043c3b4, L:/127.0.0.1:3183 ! R:/127.0.0.1:55908] 2023-05-12T16:58:26,437 - INFO - [bookie-io-8-19:BookieRequestHandler@71] - Channels disconnected: [id: 0x25015c7b, L:/127.0.0.1:3183 ! R:/127.0.0.1:55943] 2023-05-12T16:58:26,455 - INFO - [component-shutdown-thread:BookieRequestProcessor@268] - Closing RequestProcessor 2023-05-12T16:58:26,463 - INFO - [component-shutdown-thread:BookieRequestProcessor@276] - Closed RequestProcessor 2023-05-12T16:58:26,482 - INFO - [BookieStateManagerService-0-EventThread:ZKRegistrationClient$BookieServiceInfoCacheInvalidationWatcher@498] - Invalidate cache for 127.0.0.1:3183 2023-05-12T16:58:26,485 - WARN - [BookieStateManagerService-0-EventThread:ZKRegistrationClient$WatchTask@116] - Failed to schedule watch bookies task java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask@4a928729 rejected from org.apache.bookkeeper.common.util.SingleThreadSafeScheduledExecutorService@f90101f[Shutting down, pool size = 1, active threads = 0, queued tasks = 9, completed tasks = 6] at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063) ~[?:1.8.0_333] at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830) ~[?:1.8.0_333] at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:326) ~[?:1.8.0_333] at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533) ~[?:1.8.0_333] at org.apache.bookkeeper.common.util.SingleThreadSafeScheduledExecutorService.schedule(SingleThreadSafeScheduledExecutorService.java:55) ~[classes/:?] at com.google.common.util.concurrent.MoreExecutors$ScheduledListeningDecorator.schedule(MoreExecutors.java:643) ~[guava-31.0.1-jre.jar:?] at org.apache.bookkeeper.common.util.BoundedScheduledExecutorService.schedule(BoundedScheduledExecutorService.java:62) ~[classes/:?] at org.apache.bookkeeper.common.util.OrderedScheduler.schedule(OrderedScheduler.java:222) ~[classes/:?] at org.apache.bookkeeper.discover.ZKRegistrationClient$WatchTask.scheduleWatchTask(ZKRegistrationClient.java:114) ~[classes/:?] at org.apache.bookkeeper.discover.ZKRegistrationClient$WatchTask.process(ZKRegistrationClient.java:163) ~[classes/:?] at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:578) ~[zookeeper-3.8.1.jar:3.8.1] at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:553) ~[zookeeper-3.8.1.jar:3.8.1] 2023-05-12T16:58:26,584 - INFO - [component-shutdown-thread:ZooKeeper@1232] - Session: 0x1001815954f0003 closed 2023-05-12T16:58:26,584 - INFO - [main-EventThread:ClientCnxn$EventThread@568] - EventThread shut down for session: 0x1001815954f0003 2023-05-12T16:58:26,584 - INFO - [component-shutdown-thread:ComponentStarter$ComponentShutdownHook@47] - Closed component bookie-server in shutdown hook successfully. Exiting. Process finished with exit code 0 ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
