Re: [PR] KAFKA-16509: CurrentControllerId metric is unreliable in ZK mode [kafka]

2024-04-11 Thread via GitHub


cmccabe commented on PR #15695:
URL: https://github.com/apache/kafka/pull/15695#issuecomment-2050092535

   > Should we replace all `metadataCache.getControllerId` by
   > `getCurrentControllerIdFromOldController` if the former is unstable?
   
   The metadata cache is supposed to be used for returning metadata to clients. 
The two uses I see in `KafkaApis.scala` do seem to be in line with that.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-16509: CurrentControllerId metric is unreliable in ZK mode [kafka]

2024-04-11 Thread via GitHub


cmccabe merged PR #15695:
URL: https://github.com/apache/kafka/pull/15695





Re: [PR] KAFKA-16509: CurrentControllerId metric is unreliable in ZK mode [kafka]

2024-04-11 Thread via GitHub


chia7712 commented on PR #15695:
URL: https://github.com/apache/kafka/pull/15695#issuecomment-2049066956

   > This PR fixes that by using the controller ID from
   > KafkaController.scala, which is obtained directly from the controller znode.
   > It also adds a new test, ControllerIdMetricTest.scala.
   
   Should we replace all `metadataCache.getControllerId` by 
`getCurrentControllerIdFromOldController` if the former is unstable?





Re: [PR] KAFKA-16509: CurrentControllerId metric is unreliable in ZK mode [kafka]

2024-04-10 Thread via GitHub


mumrah commented on code in PR #15695:
URL: https://github.com/apache/kafka/pull/15695#discussion_r1560175751


##
core/src/main/scala/kafka/server/KafkaServer.scala:
##
@@ -657,15 +657,19 @@ class KafkaServer(
   }
 
   private def createCurrentControllerIdMetric(): Unit = {
-    KafkaYammerMetrics.defaultRegistry().newGauge(MetadataLoaderMetrics.CURRENT_CONTROLLER_ID, () => {
-      Option(metadataCache) match {
-        case None => -1
-        case Some(cache) => cache.getControllerId match {

Review Comment:
   Ah right, makes sense 👍 






Re: [PR] KAFKA-16509: CurrentControllerId metric is unreliable in ZK mode [kafka]

2024-04-10 Thread via GitHub


cmccabe commented on code in PR #15695:
URL: https://github.com/apache/kafka/pull/15695#discussion_r1560167894


##
core/src/main/scala/kafka/server/KafkaServer.scala:
##
@@ -657,15 +657,19 @@ class KafkaServer(
   }
 
   private def createCurrentControllerIdMetric(): Unit = {
-    KafkaYammerMetrics.defaultRegistry().newGauge(MetadataLoaderMetrics.CURRENT_CONTROLLER_ID, () => {
-      Option(metadataCache) match {
-        case None => -1
-        case Some(cache) => cache.getControllerId match {
-          case None => -1
-          case Some(id) => id.id
-        }
-      }
-    })
+    KafkaYammerMetrics.defaultRegistry().newGauge(MetadataLoaderMetrics.CURRENT_CONTROLLER_ID,
+      () => getCurrentControllerIdFromOldController())
+  }
+
+  /**
+   * Get the current controller ID from the old controller code.
+   * This is the most up-to-date controller ID we can get when in ZK mode.
+   */
+  def getCurrentControllerIdFromOldController(): Int = {
+    Option(_kafkaController) match {
+      case None => -1

Review Comment:
   yes, that's right.
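
   For readers following along, the startup fallback discussed here can be sketched roughly as below. This is only an illustration, not the code from the PR: `StubController` and `setController` are hypothetical stand-ins, and the `Some` branch reading `activeControllerId` is an assumption, since the quoted diff stops at the `None` case.

```scala
// Hypothetical stand-in for the real controller class; it models only the
// one field this sketch reads. The field name is an assumption.
final class StubController(initialId: Int) {
  @volatile var activeControllerId: Int = initialId
}

object ControllerIdGaugeSketch {
  // Mirrors the nullable field in the diff: null until startup assigns it.
  @volatile private var _kafkaController: StubController = null

  // Hypothetical helper, used here only to simulate the end of startup.
  def setController(controller: StubController): Unit =
    _kafkaController = controller

  // Option(null) evaluates to None, so the gauge reports -1 during startup
  // instead of throwing a NullPointerException.
  def getCurrentControllerIdFromOldController(): Int =
    Option(_kafkaController) match {
      case None => -1
      case Some(controller) => controller.activeControllerId
    }

  def main(args: Array[String]): Unit = {
    println(getCurrentControllerIdFromOldController()) // controller not set yet
    setController(new StubController(3))
    println(getCurrentControllerIdFromOldController()) // controller assigned
  }
}
```

   Wrapping the field in `Option(...)` keeps the gauge callback total: a metrics reporter can poll it at any point in the broker lifecycle and always gets an `Int` back.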






Re: [PR] KAFKA-16509: CurrentControllerId metric is unreliable in ZK mode [kafka]

2024-04-10 Thread via GitHub


cmccabe commented on code in PR #15695:
URL: https://github.com/apache/kafka/pull/15695#discussion_r1560168052


##
core/src/main/scala/kafka/server/KafkaServer.scala:
##
@@ -657,15 +657,19 @@ class KafkaServer(
   }
 
   private def createCurrentControllerIdMetric(): Unit = {
-    KafkaYammerMetrics.defaultRegistry().newGauge(MetadataLoaderMetrics.CURRENT_CONTROLLER_ID, () => {
-      Option(metadataCache) match {
-        case None => -1
-        case Some(cache) => cache.getControllerId match {

Review Comment:
   This should be OK in hybrid mode, since the controller znode is still
   updated in hybrid mode and this value comes directly from it.






Re: [PR] KAFKA-16509: CurrentControllerId metric is unreliable in ZK mode [kafka]

2024-04-10 Thread via GitHub


mumrah commented on code in PR #15695:
URL: https://github.com/apache/kafka/pull/15695#discussion_r1560161829


##
core/src/main/scala/kafka/server/KafkaServer.scala:
##
@@ -657,15 +657,19 @@ class KafkaServer(
   }
 
   private def createCurrentControllerIdMetric(): Unit = {
-    KafkaYammerMetrics.defaultRegistry().newGauge(MetadataLoaderMetrics.CURRENT_CONTROLLER_ID, () => {
-      Option(metadataCache) match {
-        case None => -1
-        case Some(cache) => cache.getControllerId match {
-          case None => -1
-          case Some(id) => id.id
-        }
-      }
-    })
+    KafkaYammerMetrics.defaultRegistry().newGauge(MetadataLoaderMetrics.CURRENT_CONTROLLER_ID,
+      () => getCurrentControllerIdFromOldController())
+  }
+
+  /**
+   * Get the current controller ID from the old controller code.
+   * This is the most up-to-date controller ID we can get when in ZK mode.
+   */
+  def getCurrentControllerIdFromOldController(): Int = {
+    Option(_kafkaController) match {
+      case None => -1

Review Comment:
   Is this just to cover the startup case, when `_kafkaController` has not been set yet?



##
core/src/main/scala/kafka/server/KafkaServer.scala:
##
@@ -657,15 +657,19 @@ class KafkaServer(
   }
 
   private def createCurrentControllerIdMetric(): Unit = {
-    KafkaYammerMetrics.defaultRegistry().newGauge(MetadataLoaderMetrics.CURRENT_CONTROLLER_ID, () => {
-      Option(metadataCache) match {
-        case None => -1
-        case Some(cache) => cache.getControllerId match {

Review Comment:
   Will this work in hybrid mode? Don't we still need to read from the metadata 
cache to get the KRaft controller ID once we've entered hybrid mode?


