RaulGracia opened a new issue #2482:
URL: https://github.com/apache/bookkeeper/issues/2482


   **BUG REPORT**
   
   ***Describe the bug***
   We have found a problem that we believe is mostly related to the BookKeeper client. It is as follows: we deploy a 3-Bookie ensemble in Kubernetes along with Pravega, and then deliberately reboot one of the Kubernetes nodes, which forces the restart of all pods running on it, including one Bookie. In such deployments, we have multiple Segment Stores connected to BookKeeper. Initially, almost all of them correctly detected right away that a Bookie was down (`2020-10-23 05:24:15` server time):
   ```
   2020-10-23 05:24:15,706 34559029 [bookkeeper-io-3-2] INFO  
o.a.b.proto.PerChannelBookieClient - Disconnected from bookie channel [id: 
0xa020271c, L:/xxx.176:48038 ! 
R:bookie-0.bookie-headless.pravega.svc.cluster.local/xxx.146:3181]
   ```
   ```
   2020-10-23 05:24:15,707 34567544 [bookkeeper-io-3-2] INFO  
o.a.b.proto.PerChannelBookieClient - Disconnected from bookie channel [id: 
0x578a8730, L:/xxx.92:51952 ! 
R:bookie-0.bookie-headless.pravega.svc.cluster.local/xxx.146:3181]
   ```
   ```
   2020-10-23 05:24:15,707 34546304 [bookkeeper-io-3-1] INFO  
o.a.b.proto.PerChannelBookieClient - Disconnected from bookie channel [id: 
0x7da89b7f, L:/xxx.207:59078 ! 
R:bookie-0.bookie-headless.pravega.svc.cluster.local/xxx.146:3181]
   ```
   One of the Segment Stores closed the channel when the restarted Bookie was 
up again (`2020-10-23 05:41:22` server time):
   ```
   2020-10-23 05:41:22,232 35606108 [bookkeeper-io-3-1] INFO  
o.a.b.proto.PerChannelBookieClient - Disconnected from bookie channel [id: 
0x3f5b5d81, L:/xxx.9:59906 ! 
R:bookie-0.bookie-headless.pravega.svc.cluster.local/xxx.146:3181]
   ```
   So far so good: the system introduces no delay beyond the time taken by the node itself to reboot. However, the problem lies with one of the Segment Stores, whose _BookKeeper client took 15 additional minutes to close the Bookie channel_:
   ```
   2020-10-23 05:57:19,762 36520312 [bookkeeper-io-3-2] INFO  
o.a.b.proto.PerChannelBookieClient - Disconnected from bookie channel [id: 
0x9796f294, L:/xxx.68:40938 ! 
R:bookie-0.bookie-headless.pravega.svc.cluster.local/xxx.146:3181]
   ```
   The point is that, apparently, _the BookKeeper client does not re-resolve the Bookie's name to its new IP until the Bookie channel is closed_. Therefore, for 15 minutes, the BookKeeper client in _that Segment Store kept attempting to read from the old IP of Bookie 0_. During that whole period, the Segment Store was getting `BKReadException`s while trying to read from Bookie 0, but it looks like this is not enough to trigger re-resolution of the Bookie name:
   ```
   2020-10-23 05:57:08,475 36509025 [core-14] WARN  
i.p.s.s.c.StreamSegmentContainer - SegmentContainer[37]: DurableLog failed. 
Shutting down StreamSegmentContainer.
   io.pravega.segmentstore.storage.DurableDataLogException: Unable to open-read 
ledger 11009.
        at 
io.pravega.segmentstore.storage.impl.bookkeeper.Ledgers.openRead(Ledgers.java:109)
   ...
   Caused by: org.apache.bookkeeper.client.BKException$BKReadException: Error 
while reading ledger
   ``` 
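   For context, this stale-IP behavior is consistent with how eager name resolution works in standard Java networking (a general illustration, not the exact BookKeeper code path): an `InetSocketAddress` built from a hostname resolves once at construction time and pins that IP for its lifetime, so a channel built from it keeps the old address until a fresh connect:
   ```java
   public class ResolveOnce {
       public static void main(String[] args) {
           // Resolution happens here, once; if the peer's IP changes later,
           // this object still holds the old address until a new one is built.
           java.net.InetSocketAddress eager =
               new java.net.InetSocketAddress("localhost", 3181);
           System.out.println(eager.isUnresolved()); // false: IP captured eagerly

           // createUnresolved defers resolution to connect time instead:
           java.net.InetSocketAddress lazy =
               java.net.InetSocketAddress.createUnresolved("localhost", 3181);
           System.out.println(lazy.isUnresolved()); // true
       }
   }
   ```
   If the client only rebuilds the address on channel close, that would explain why the old IP survives for as long as the channel stays open.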
   Apparently, the correct behavior would be to trigger `PerChannelBookieClient:channelInactive` as soon as a Bookie crashes, but according to our experiments this does not always happen. As a hypothesis, the fact that we consistently see a 15-minute delay before the Bookie channel is closed is striking; it may indicate that other underlying timeouts are in play that prevent the connection from being closed and `channelInactive` from being invoked, but I have no proof of that. 
   
   
   
   ***To Reproduce***
   
   Steps to reproduce the behavior:
   1. Deploy Pravega (https://pravega.io/) in a Kubernetes cluster (e.g. 
OpenShift, Kubespray). Our reproductions of the issue use 6 Segment Stores and 
3 Bookies (writeQuorum=ensembleQuorum=3).
   2. Once the system is up and running, shut down any of the Kubernetes nodes 
running a Bookie.
   3. In this situation, we have been able to reproduce the behavior in which 1 Segment Store (out of 6) keeps trying to contact the restarted Bookie at its old IP. This persists for over 15 minutes after the Bookie was restarted, until the BookKeeper client on that particular Segment Store closes the channel. All the other Segment Stores (and the other Bookies in the system) detect the failed Bookie as soon as it goes down and reconnect to it when it comes up again.
   
   The configuration of our Bookkeeper client can be found here: 
https://github.com/pravega/pravega/blob/7e8a26133281a86ca8823c4665fd7d2fb836eeb1/segmentstore/storage/impl/src/main/java/io/pravega/segmentstore/storage/impl/bookkeeper/BookKeeperLogFactory.java#L138
   
   ***Expected behavior***
   
   All BookKeeper clients should detect that a Bookie in the ensemble is unresponsive and reset the channel, so that the Bookie's IP can be re-resolved in case it has changed.
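   One possible mitigation shape (purely hypothetical; `FailureWatchdog` and the `resetChannel` callback are names made up for illustration, not BookKeeper API) would be to count consecutive read failures on a channel and proactively close it, rather than waiting for TCP to give up, since closing the channel is what triggers re-resolution:
   ```java
   import java.util.concurrent.atomic.AtomicInteger;

   // Hypothetical sketch: force a channel reset (and hence a fresh DNS
   // resolution on reconnect) after N consecutive read failures.
   public class FailureWatchdog {
       private final int threshold;
       private final Runnable resetChannel; // e.g. a channel-close callback
       private final AtomicInteger consecutiveFailures = new AtomicInteger();

       public FailureWatchdog(int threshold, Runnable resetChannel) {
           this.threshold = threshold;
           this.resetChannel = resetChannel;
       }

       public void onReadSuccess() {
           consecutiveFailures.set(0);
       }

       public void onReadFailure() {
           if (consecutiveFailures.incrementAndGet() >= threshold) {
               consecutiveFailures.set(0);
               // closing the channel would fire channelInactive and allow
               // the client to re-resolve the bookie address on reconnect
               resetChannel.run();
           }
       }
   }
   ```
   This is only meant to illustrate the expected behavior; where exactly such a check would belong inside the client is for the BookKeeper maintainers to judge.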
   
   ***Screenshots***
   
   N/A
   
   ***Additional context***
   
   Following https://github.com/apache/bookkeeper/pull/1762, we have experimented with setting `networkaddress.cache.ttl`, without success.
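   For completeness, a minimal sketch of setting the JVM DNS cache TTL programmatically (via the standard `java.security.Security` mechanism; it can also be set in the `java.security` file or with the `sun.net.inetaddr.ttl` system property at startup):
   ```java
   import java.security.Security;

   public class DnsCacheTtl {
       public static void main(String[] args) {
           // Disable positive and negative DNS caching so each InetAddress
           // lookup re-resolves; must run before the first lookup to take
           // effect reliably.
           Security.setProperty("networkaddress.cache.ttl", "0");
           Security.setProperty("networkaddress.cache.negative.ttl", "0");
           System.out.println(Security.getProperty("networkaddress.cache.ttl"));
       }
   }
   ```
   Note that this setting would not help if the stale address is pinned in an already-resolved socket address held by the open channel, which would be consistent with the behavior we observed.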
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

