hangc0276 commented on issue #3759: URL: https://github.com/apache/bookkeeper/issues/3759#issuecomment-1413443616
Hi @gaozhangmin, thanks for raising this issue.

https://github.com/apache/bookkeeper/pull/3239 marks `OPEN` state `Empty` ledgers whose ensemble contains dead bookies as under-replicated in the Auditor's checkAllLedgers stage. IMO this makes sense; otherwise those ledgers can't be replicated and they block the decommissioning process.

When those `OPEN` state `Empty` ledgers are fenced, the bookie client triggers a fence readLAC, which falls into the `locationsDb.getFloor(maxEntryId.array)` call, and the key does not exist in RocksDB because the ledger is empty. From the performance picture you provided, the `locationsDb.getFloor(maxEntryId.array)` call costs 316s, which is unacceptable. The root cause may be related to RocksDB itself, not to the behavior of marking `OPEN` state `Empty` ledgers as under-replicated.

I have a few questions about this issue.

- Would you please share the size of your RocksDB, including the storage size and the number of keys?
- For ledgers that are in `OPEN` state but not `Empty`, calling getLAC also falls into `locationsDb.getFloor(maxEntryId.array)`. Does it also cost so much time?
- Would you write a test that opens your RocksDB and calls `locationsDb.getFloor(maxEntryId.array)` for an existing and a non-existent ledgerId, to see the time cost?
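
To make the floor lookup discussed above concrete, here is a minimal sketch of the idea, not the BookKeeper implementation. It assumes the location index is ordered by (ledgerId, entryId) and uses a `java.util.TreeMap` as a stand-in for RocksDB; the class `FloorLookupSketch`, the `Key` type, and `lastEntryId` are hypothetical names for illustration only. It shows why an empty ledger has no matching key (the floor lands on a neighbouring ledger or on nothing) and how the two cases can be timed side by side. It says nothing about real RocksDB latency; for that, the same pattern would have to be run against the actual database.

```java
import java.util.Map;
import java.util.TreeMap;

public class FloorLookupSketch {
    // Hypothetical composite key, ordered by ledgerId first and entryId second.
    static final class Key implements Comparable<Key> {
        final long ledgerId;
        final long entryId;

        Key(long ledgerId, long entryId) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
        }

        @Override
        public int compareTo(Key other) {
            int byLedger = Long.compare(ledgerId, other.ledgerId);
            return byLedger != 0 ? byLedger : Long.compare(entryId, other.entryId);
        }
    }

    // Floor lookup on (ledgerId, Long.MAX_VALUE): returns the last entry id of
    // the ledger, or -1 when the floor is missing or belongs to another ledger.
    static long lastEntryId(TreeMap<Key, Long> index, long ledgerId) {
        Map.Entry<Key, Long> floor = index.floorEntry(new Key(ledgerId, Long.MAX_VALUE));
        if (floor == null || floor.getKey().ledgerId != ledgerId) {
            return -1L; // no entry stored for this ledger
        }
        return floor.getKey().entryId;
    }

    public static void main(String[] args) {
        TreeMap<Key, Long> index = new TreeMap<>();
        // Ledger 1 has entries 0..2, ledger 3 has entries 0..1, ledger 2 is empty.
        // The values are placeholder locations.
        for (long e = 0; e <= 2; e++) {
            index.put(new Key(1L, e), 1000L + e);
        }
        for (long e = 0; e <= 1; e++) {
            index.put(new Key(3L, e), 3000L + e);
        }

        // Time one lookup for a ledger with entries and one for an empty ledger.
        for (long ledgerId : new long[] {3L, 2L}) {
            long start = System.nanoTime();
            long last = lastEntryId(index, ledgerId);
            long elapsedNanos = System.nanoTime() - start;
            System.out.println("ledger " + ledgerId + ": lastEntryId=" + last
                    + ", elapsedNanos=" + elapsedNanos);
        }
    }
}
```

Running the same two lookups against a copy of the affected RocksDB, one for a ledgerId that has entries and one for a ledgerId that has none, would show whether only the missing-key case is slow.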
