gaozhangmin opened a new issue, #3408:
URL: https://github.com/apache/bookkeeper/issues/3408

   Our production environment failed last week: every bookie was killed by a 
direct memory OOM. It started after one bookie's disk broke and we tried to 
take that bookie offline. After auditBookie was triggered, direct memory usage 
on all bookies kept increasing, which looks like a memory leak.
   
   The ReplicationWorker log (x.x.x.x is the IP of the lost bookie):
   ```
   2022-07-14 21:50:41.721 [ReplicationWorker] ERROR 
org.apache.bookkeeper.proto.PerChannelBookieClient - Could not connect to 
bookie: null/x.x.x.x:3181, current state CONNECTING : 
   2022-07-14 21:50:41.723 [BookKeeperClientWorker-OrderedExecutor-8-0] INFO  
org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not 
available while reading L61080496 E1502 from bookie: x.x.x.x:3181
   2022-07-14 21:50:41.724 [ReplicationWorker] ERROR 
org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to 
x.x.x.x:3181 as endpoint resolution failed (probably bookie is down) err 
org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: 
Cannot resolve bookieId x.x.x.x:3181, bookie does not exist or it is not running
   2022-07-14 21:50:41.724 [BookKeeperClientWorker-OrderedExecutor-8-0] INFO  
org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not 
available while reading L61080496 E1506 from bookie: x.x.x.x:3181
   2022-07-14 21:50:41.724 [ReplicationWorker] ERROR 
org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to 
x.x.x.x:3181 as endpoint resolution failed (probably bookie is down) err 
org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: 
Cannot resolve bookieId x.x.x.x:3181, bookie does not exist or it is not running
   2022-07-14 21:50:41.724 [BookKeeperClientWorker-OrderedExecutor-8-0] INFO  
org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not 
available while reading L61080496 E1510 from bookie: x.x.x.x:3181
   2022-07-14 21:50:41.725 [ReplicationWorker] ERROR 
org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to 
x.x.x.x:3181 as endpoint resolution failed (probably bookie is down) err 
org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: 
Cannot resolve bookieId x.x.x.x:3181, bookie does not exist or it is not running
   2022-07-14 21:50:41.725 [BookKeeperClientWorker-OrderedExecutor-8-0] INFO  
org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not 
available while reading L61080496 E1514 from bookie: x.x.x.x:3181
   2022-07-14 21:50:41.725 [ReplicationWorker] ERROR 
org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to 
x.x.x.x:3181 as endpoint resolution failed (probably bookie is down) err 
org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: 
Cannot resolve bookieId x.x.x.x:3181, bookie does not exist or it is not running
   2022-07-14 21:50:41.725 [BookKeeperClientWorker-OrderedExecutor-8-0] INFO  
org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not 
available while reading L61080496 E1518 from bookie: x.x.x.x:3181
   2022-07-14 21:50:41.725 [ReplicationWorker] ERROR 
org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to 
x.x.x.x:3181 as endpoint resolution failed (probably bookie is down) err 
org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: 
Cannot resolve bookieId x.x.x.x:3181, bookie does not exist or it is not running
   
   2022-07-14 21:50:42.403 [BookKeeperClientWorker-OrderedExecutor-8-0] INFO  
org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not 
available while reading L61080496 E1974 from bookie: x.x.x.x:3181
   2022-07-14 21:50:42.440 [BookKeeperClientWorker-OrderedExecutor-8-0] INFO  
org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not 
available while reading L61080496 E1978 from bookie: x.x.x.x:3181
   2022-07-14 21:50:42.525 [BookKeeperClientWorker-OrderedExecutor-8-0] INFO  
org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not 
available while reading L61080496 E1982 from bookie: x.x.x.x:3181
   2022-07-14 21:50:42.593 [BookKeeperClientWorker-OrderedExecutor-8-0] INFO  
org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not 
available while reading L61080496 E1986 from bookie: x.x.x.x:3181
   2022-07-14 21:50:42.665 [BookKeeperClientWorker-OrderedExecutor-8-0] INFO  
org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not 
available while reading L61080496 E1990 from bookie: x.x.x.x:3181
   2022-07-14 21:50:42.706 [BookKeeperClientWorker-OrderedExecutor-8-0] INFO  
org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not 
available while reading L61080496 E1994 from bookie: x.x.x.x:3181
   2022-07-14 21:50:42.776 [BookKeeperClientWorker-OrderedExecutor-8-0] INFO  
org.apache.bookkeeper.client.PendingReadOp - Error: Bookie handle is not 
available while reading L61080496 E1998 from bookie: x.x.x.x:3181
   2022-07-14 21:50:44.271 [BookKeeperClientWorker-OrderedExecutor-8-0] ERROR 
org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to 
x.x.x.x:3181 as endpoint resolution failed (probably bookie is down) err 
org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: 
Cannot resolve bookieId x.x.x.x:3181, bookie does not exist or it is not running
   2022-07-14 21:50:44.271 [BookKeeperClientWorker-OrderedExecutor-8-0] ERROR 
org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to 
x.x.x.x:3181 as endpoint resolution failed (probably bookie is down) err 
org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: 
Cannot resolve bookieId x.x.x.x:3181, bookie does not exist or it is not running
   2022-07-14 21:50:44.271 [BookKeeperClientWorker-OrderedExecutor-8-0] ERROR 
org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to 
x.x.x.x:3181 as endpoint resolution failed (probably bookie is down) err 
org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: 
Cannot resolve bookieId x.x.x.x:3181, bookie does not exist or it is not running
   2022-07-14 21:50:44.271 [BookKeeperClientWorker-OrderedExecutor-8-0] ERROR 
org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to 
x.x.x.x:3181 as endpoint resolution failed (probably bookie is down) err 
org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: 
Cannot resolve bookieId x.x.x.x:3181, bookie does not exist or it is not running
   2022-07-14 21:50:44.271 [BookKeeperClientWorker-OrderedExecutor-8-0] ERROR 
org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to 
x.x.x.x:3181 as endpoint resolution failed (probably bookie is down) err 
org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: 
Cannot resolve bookieId x.x.x.x:3181, bookie does not exist or it is not running
   2022-07-14 21:50:44.271 [BookKeeperClientWorker-OrderedExecutor-8-0] ERROR 
org.apache.bookkeeper.proto.PerChannelBookieClient - Cannot connect to 
x.x.x.x:3181 as endpoint resolution failed (probably bookie is down) err 
org.apache.bookkeeper.proto.BookieAddressResolver$BookieIdNotResolvedException: 
Cannot resolve bookieId x.x.x.x:3181, bookie does not exist or it is not running
   
   2022-07-14 22:10:19.830 [BookKeeperClientWorker-OrderedExecutor-41-0] INFO  
org.apache.bookkeeper.client.PendingReadOp - Error: Bookie operation timeout 
while reading L60558419 E359 from bookie: 10.71.168.13:3181
   2022-07-14 22:10:19.830 [BookKeeperClientWorker-OrderedExecutor-41-0] ERROR 
org.apache.bookkeeper.client.LedgerFragmentReplicator - BK error reading ledger 
entry: 434
   2022-07-14 22:10:19.831 [BookKeeperClientWorker-OrderedExecutor-41-0] ERROR 
org.apache.bookkeeper.proto.BookkeeperInternalCallbacks - Error in multi 
callback : -23
   
   
   is (-1, rc = null)
   2022-07-14 22:10:19.830 [BookKeeperClientWorker-OrderedExecutor-41-0] INFO  
org.apache.bookkeeper.client.PendingReadOp - Error: Bookie operation timeout 
while reading L60558419 E378 from bookie: 1.1.1.1:3181
   2022-07-14 22:10:19.830 [BookKeeperClientWorker-OrderedExecutor-41-0] ERROR 
org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: 
L60558419 E378-E378, Sent to [x.x.x.x:3181, 1.1.1.1:3181], Heard from [] : 
bitset = {}, Error = 'Bookie operation timeout'. First unread entry is (-1, rc 
= null)
   
   
   ```
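
For triaging an excerpt like the one above, here is a rough sketch that tallies the `PendingReadOp` read failures by reason, ledger, and bookie. It is not part of BookKeeper: `tally_read_failures` is a hypothetical helper, and the pattern assumes only the message format visible in this log, which may differ in other versions.

```python
import re
from collections import Counter

# Matches the PendingReadOp failure messages seen in the log above, e.g.
# "Error: Bookie handle is not available while reading L61080496 E1502
#  from bookie: x.x.x.x:3181". The format is assumed from this excerpt only.
READ_FAILURE = re.compile(
    r"Error: (?P<reason>[^:]+?) while reading L(?P<ledger>\d+) E(?P<entry>\d+) "
    r"from bookie: (?P<bookie>\S+)"
)


def tally_read_failures(log_text):
    """Count failed entry reads per (reason, ledger id, bookie address)."""
    # Collapse hard-wrapped lines so a message split across lines still matches.
    flat = " ".join(log_text.split())
    counts = Counter()
    for m in READ_FAILURE.finditer(flat):
        counts[(m["reason"], int(m["ledger"]), m["bookie"])] += 1
    return counts
```

Passing the full ReplicationWorker log text to `tally_read_failures(...)` and calling `.most_common()` on the result would show which ledgers and bookies account for most of the failed reads.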
   
   Brokers also kept quarantining bookies, and in the end all bookies 
crashed.
   
   
   
![image](https://user-images.githubusercontent.com/9278488/179348752-21cbf405-1f80-4ecc-917e-8524f2742bb9.png)
   

