Re: [PR] KAFKA-15950: Serialize broker heartbeat requests [kafka]

via GitHub Mon, 04 Dec 2023 10:29:06 -0800


junrao commented on code in PR #14903:
URL: https://github.com/apache/kafka/pull/14903#discussion_r1414322506



##########
core/src/test/scala/unit/kafka/server/BrokerLifecycleManagerTest.scala:
##########
@@ -197,11 +197,14 @@ class BrokerLifecycleManagerTest {
     result
   }
 
-  def poll[T](context: RegistrationTestContext, manager: BrokerLifecycleManager, future: Future[T]): T = {
-    while (!future.isDone || context.mockClient.hasInFlightRequests) {
-      context.poll()
+  def poll[T](ctx: RegistrationTestContext, manager: BrokerLifecycleManager, future: Future[T]): T = {
+    while (ctx.mockChannelManager.unsentQueue.isEmpty) {
+      if (manager.eventQueue.isEmpty)

Review Comment:
   Why do we need this check? If the eventQueue has an event scheduled at a 
future time, we need to advance the time to drain the event, right?
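
   To illustrate the concern, here is a minimal sketch of a deadline-ordered queue driven by a mock clock. `MockClock`, `DeferredQueue`, `scheduleDeferred` and `drainDue` are hypothetical stand-ins written for this example; they are not Kafka's `MockTime` or `KafkaEventQueue`. The sketch only shows that a non-empty queue holding a future-dated event makes no progress until the clock is advanced.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for a mock clock and a deadline-ordered event queue.
// These are NOT Kafka's MockTime or KafkaEventQueue.
object DeferredEventSketch {
  final class MockClock(var nowMs: Long = 0L) {
    def sleep(ms: Long): Unit = nowMs += ms
  }

  final class DeferredQueue(clock: MockClock) {
    private type Entry = (Long, () => Unit) // (deadlineMs, action)

    // Reverse the ordering so the earliest deadline is at the head.
    private val pending =
      mutable.PriorityQueue.empty[Entry](Ordering.by[Entry, Long](_._1).reverse)

    def scheduleDeferred(delayMs: Long)(action: => Unit): Unit =
      pending.enqueue((clock.nowMs + delayMs, () => action))

    def isEmpty: Boolean = pending.isEmpty

    /** Runs every event whose deadline has passed; returns how many ran. */
    def drainDue(): Int = {
      var ran = 0
      while (pending.nonEmpty && pending.head._1 <= clock.nowMs) {
        val (_, action) = pending.dequeue()
        action()
        ran += 1
      }
      ran
    }
  }
}
```

   In this model, a test loop that only polls while the queue is empty would never reach the deferred event; it has to call something like `clock.sleep(...)` first.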



##########
core/src/main/scala/kafka/server/BrokerLifecycleManager.scala:
##########
@@ -166,6 +166,19 @@ class BrokerLifecycleManager(
    */
   private var registered = false
 
+  /**
+   * True if a request has been sent and a response or timeout has not yet been processed.

Review Comment:
   a request => a Heartbeat request ?



##########
core/src/main/scala/kafka/server/BrokerLifecycleManager.scala:
##########
@@ -366,8 +379,27 @@ class BrokerLifecycleManager(
       new BrokerRegistrationResponseHandler())
   }
 
+  // the response handler is not invoked from the event handler thread,
+  // so it is not safe to update state here, instead, schedule an event
+  // to continue handling the response on the event handler thread
   private class BrokerRegistrationResponseHandler extends ControllerRequestCompletionHandler {
     override def onComplete(response: ClientResponse): Unit = {
+      eventQueue.prepend(new BrokerRegistrationResponseEvent(response, false))
+    }
+
+    override def onTimeout(): Unit = {
+      info(s"Unable to register the broker because the RPC got timed out before it could be sent.")
+      eventQueue.prepend(new BrokerRegistrationResponseEvent(null, true))
+    }
+  }
+
+  private class BrokerRegistrationResponseEvent(response: ClientResponse, timedOut: Boolean) extends EventQueue.Event {
+    override def run(): Unit = {
+      communicationInFlight = false

Review Comment:
   It seems that we don't need this since `communicationInFlight` is only set 
to true when sending a `HeartbeatRequest`?



##########
core/src/main/scala/kafka/server/BrokerLifecycleManager.scala:
##########
@@ -453,79 +490,73 @@ class BrokerLifecycleManager(
         val message = response.responseBody().asInstanceOf[BrokerHeartbeatResponse]
         val errorCode = Errors.forCode(message.data().errorCode())
         if (errorCode == Errors.NONE) {
-          // this response handler is not invoked from the event handler thread,
-          // and processing a successful heartbeat response requires updating
-          // state, so to continue we need to schedule an event
-          eventQueue.prepend(new BrokerHeartbeatResponseEvent(message.data()))
+          val responseData = message.data()
+          failedAttempts = 0
+          _state match {
+            case BrokerState.STARTING =>
+              if (responseData.isCaughtUp) {
+                info(s"The broker has caught up. Transitioning from STARTING to RECOVERY.")
+                _state = BrokerState.RECOVERY
+                initialCatchUpFuture.complete(null)
+              } else {
+                debug(s"The broker is STARTING. Still waiting to catch up with cluster metadata.")
+              }
+              // Schedule the heartbeat after only 10 ms so that in the case where
+              // there is no recovery work to be done, we start up a bit quicker.
+              scheduleNextCommunication(NANOSECONDS.convert(10, MILLISECONDS))
+            case BrokerState.RECOVERY =>
+              if (!responseData.isFenced) {
+                info(s"The broker has been unfenced. Transitioning from RECOVERY to RUNNING.")
+                initialUnfenceFuture.complete(null)
+                _state = BrokerState.RUNNING
+              } else {
+                info(s"The broker is in RECOVERY.")
+              }
+              scheduleNextCommunicationAfterSuccess()
+            case BrokerState.RUNNING =>
+              debug(s"The broker is RUNNING. Processing heartbeat response.")
+              scheduleNextCommunicationAfterSuccess()
+            case BrokerState.PENDING_CONTROLLED_SHUTDOWN =>
+              if (!responseData.shouldShutDown()) {
+                info(s"The broker is in PENDING_CONTROLLED_SHUTDOWN state, still waiting " +
+                  "for the active controller.")
+                if (!gotControlledShutdownResponse) {
+                  // If this is the first pending controlled shutdown response we got,
+                  // schedule our next heartbeat a little bit sooner than we usually would.
+                  // In the case where controlled shutdown completes quickly, this will
+                  // speed things up a little bit.
+                  scheduleNextCommunication(NANOSECONDS.convert(50, MILLISECONDS))
+                } else {
+                  scheduleNextCommunicationAfterSuccess()
+                }
+              } else {
+                info(s"The controller has asked us to exit controlled shutdown.")
+                beginShutdown()
+              }
+              gotControlledShutdownResponse = true
+            case BrokerState.SHUTTING_DOWN =>
+              info(s"The broker is SHUTTING_DOWN. Ignoring heartbeat response.")
+            case _ =>
+              error(s"Unexpected broker state ${_state}")
+              scheduleNextCommunicationAfterSuccess()
+          }
         } else {
           warn(s"Broker $nodeId sent a heartbeat request but received error $errorCode.")
           scheduleNextCommunicationAfterFailure()
         }
       }
     }
-
-    override def onTimeout(): Unit = {
-      info("Unable to send a heartbeat because the RPC got timed out before it could be sent.")
-      scheduleNextCommunicationAfterFailure()
-    }
   }
 
-  private class BrokerHeartbeatResponseEvent(response: BrokerHeartbeatResponseData) extends EventQueue.Event {
-    override def run(): Unit = {
-      failedAttempts = 0
-      _state match {
-        case BrokerState.STARTING =>
-          if (response.isCaughtUp) {
-            info(s"The broker has caught up. Transitioning from STARTING to RECOVERY.")
-            _state = BrokerState.RECOVERY
-            initialCatchUpFuture.complete(null)
-          } else {
-            debug(s"The broker is STARTING. Still waiting to catch up with cluster metadata.")
-          }
-          // Schedule the heartbeat after only 10 ms so that in the case where
-          // there is no recovery work to be done, we start up a bit quicker.
-          scheduleNextCommunication(NANOSECONDS.convert(10, MILLISECONDS))
-        case BrokerState.RECOVERY =>
-          if (!response.isFenced) {
-            info(s"The broker has been unfenced. Transitioning from RECOVERY to RUNNING.")
-            initialUnfenceFuture.complete(null)
-            _state = BrokerState.RUNNING
-          } else {
-            info(s"The broker is in RECOVERY.")
-          }
-          scheduleNextCommunicationAfterSuccess()
-        case BrokerState.RUNNING =>
-          debug(s"The broker is RUNNING. Processing heartbeat response.")
-          scheduleNextCommunicationAfterSuccess()
-        case BrokerState.PENDING_CONTROLLED_SHUTDOWN =>
-          if (!response.shouldShutDown()) {
-            info(s"The broker is in PENDING_CONTROLLED_SHUTDOWN state, still waiting " +
-              "for the active controller.")
-            if (!gotControlledShutdownResponse) {
-              // If this is the first pending controlled shutdown response we got,
-              // schedule our next heartbeat a little bit sooner than we usually would.
-              // In the case where controlled shutdown completes quickly, this will
-              // speed things up a little bit.
-              scheduleNextCommunication(NANOSECONDS.convert(50, MILLISECONDS))
-            } else {
-              scheduleNextCommunicationAfterSuccess()
-            }
-          } else {
-            info(s"The controller has asked us to exit controlled shutdown.")
-            beginShutdown()
-          }
-          gotControlledShutdownResponse = true
-        case BrokerState.SHUTTING_DOWN =>
-          info(s"The broker is SHUTTING_DOWN. Ignoring heartbeat response.")
-        case _ =>
-          error(s"Unexpected broker state ${_state}")
-          scheduleNextCommunicationAfterSuccess()
-      }
+  private def scheduleNextCommunicationImmediately(): Unit = {
+    if (communicationInFlight) {

Review Comment:
   I am not sure this completely avoids duplicated `HeartbeatRequest`s. 
Consider the following flow. The event queue contains a 
`BrokerHeartbeatResponseEvent` followed by an `OfflineDirEvent`. The event 
queue thread processes the former: it sets `communicationInFlight` to false 
and enqueues a `CommunicationEvent`. It then processes the `OfflineDirEvent`. 
Since `communicationInFlight` is still false, it enqueues another 
`CommunicationEvent`. Now there are two `CommunicationEvent`s in the queue, 
and each will send a separate `HeartbeatRequest`.
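
   The flow can be reproduced with a small single-threaded model. This is only a sketch under stated assumptions: the `Event` objects, `run`, and the `communicationScheduled` flag are hypothetical names invented here and are not part of `BrokerLifecycleManager`. The `dedupe = true` branch shows one possible guard, not necessarily the fix this PR should adopt.

```scala
import scala.collection.mutable

// Hypothetical, simplified model of the flow described above.
// It is NOT the actual BrokerLifecycleManager code.
object DuplicateHeartbeatSketch {
  sealed trait Event
  case object HeartbeatResponse extends Event
  case object OfflineDir extends Event
  case object Communication extends Event

  /** Drains the queue and returns the number of heartbeat requests sent. */
  def run(dedupe: Boolean): Int = {
    val queue = mutable.Queue[Event](HeartbeatResponse, OfflineDir)
    var communicationInFlight = true   // a heartbeat is outstanding at the start
    var communicationScheduled = false // assumed extra flag, consulted only when dedupe = true
    var heartbeatsSent = 0

    def scheduleNextCommunicationImmediately(): Unit = {
      val alreadyQueued = dedupe && communicationScheduled
      if (!communicationInFlight && !alreadyQueued) {
        communicationScheduled = true
        queue.enqueue(Communication)
      }
    }

    while (queue.nonEmpty) {
      queue.dequeue() match {
        case HeartbeatResponse =>
          communicationInFlight = false
          scheduleNextCommunicationImmediately()
        case OfflineDir =>
          scheduleNextCommunicationImmediately()
        case Communication =>
          // Sending a heartbeat marks communication as in flight again.
          communicationScheduled = false
          communicationInFlight = true
          heartbeatsSent += 1
      }
    }
    heartbeatsSent
  }
}
```

   In this model, `run(dedupe = false)` sends two heartbeats for one response, while `run(dedupe = true)` sends one, because the second scheduling attempt sees that a `Communication` event is already queued.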
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
