Re: [PR] KAFKA-13907: Fix hanging ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable [kafka]

2024-04-09 Thread via GitHub


chia7712 merged PR #12174:
URL: https://github.com/apache/kafka/pull/12174


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-13907: Fix hanging ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable [kafka]

2024-04-09 Thread via GitHub


chia7712 commented on PR #12174:
URL: https://github.com/apache/kafka/pull/12174#issuecomment-2044780100

   @soarez I believe those failed tests are unrelated, but could you rebase 
code to trigger QA again due to incompletion of `JDK 17 and Scala 2.13`? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-13907: Fix hanging ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable [kafka]

2024-04-09 Thread via GitHub


soarez commented on PR #12174:
URL: https://github.com/apache/kafka/pull/12174#issuecomment-2044731959

   @chia7712 done. It seems to me that the current test failures are unrelated. 
Could you confirm?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-13907: Fix hanging ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable [kafka]

2024-04-07 Thread via GitHub


chia7712 commented on PR #12174:
URL: https://github.com/apache/kafka/pull/12174#issuecomment-2041478299

   @soarez Could you please fix the conflicts? thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-13907: Fix hanging ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable [kafka]

2024-04-06 Thread via GitHub


soarez commented on code in PR #12174:
URL: https://github.com/apache/kafka/pull/12174#discussion_r1554541084


##
core/src/test/scala/unit/kafka/integration/KafkaServerTestHarness.scala:
##
@@ -260,9 +274,12 @@ abstract class KafkaServerTestHarness extends 
QuorumTestHarness {
   }
 
   def killBroker(index: Int): Unit = {

Review Comment:
   Oh I see. It was a mistake to remove `awaitShutdown`. I think we should 
leave the behavior as is, I don't see any reason why the new `killBroker(idx, 
timeout)` alternative shouldn't also call `awaitShutdown`. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-13907: Fix hanging ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable [kafka]

2024-04-04 Thread via GitHub


chia7712 commented on code in PR #12174:
URL: https://github.com/apache/kafka/pull/12174#discussion_r1551687551


##
core/src/test/scala/unit/kafka/integration/KafkaServerTestHarness.scala:
##
@@ -260,9 +274,12 @@ abstract class KafkaServerTestHarness extends 
QuorumTestHarness {
   }
 
   def killBroker(index: Int): Unit = {

Review Comment:
   My point was "should we keep `_brokers(index).awaitShutdown()`"? It seems 
the latch used by `awaitShutdown` should be notified after `shutdown` get 
completed, so maybe it is fine to remove it. It would be great to add comments 
if you prefer to remove `awaitShutdown`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-13907: Fix hanging ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable [kafka]

2024-04-04 Thread via GitHub


chia7712 commented on code in PR #12174:
URL: https://github.com/apache/kafka/pull/12174#discussion_r1551371510


##
core/src/test/scala/unit/kafka/server/ServerShutdownTest.scala:
##
@@ -197,12 +198,11 @@ class ServerShutdownTest extends KafkaServerTestHarness {
 verifyNonDaemonThreadsStatus()
   }
 
-  @Disabled
   @ParameterizedTest(name = TestInfoUtils.TestWithParameterizedQuorumName)
   @ValueSource(strings = Array("kraft"))
   def testCleanShutdownWithKRaftControllerUnavailable(quorum: String): Unit = {

Review Comment:
   That is good to me



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-13907: Fix hanging ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable [kafka]

2024-04-04 Thread via GitHub


soarez commented on code in PR #12174:
URL: https://github.com/apache/kafka/pull/12174#discussion_r1551132979


##
core/src/test/java/kafka/test/junit/ZkClusterInvocationContext.java:
##
@@ -323,7 +324,7 @@ public void rollingBrokerRestart() {
 throw new IllegalStateException("Tried to restart brokers but 
the cluster has not been started!");
 }
 for (int i = 0; i < clusterReference.get().brokerCount(); i++) {
-clusterReference.get().killBroker(i);
+clusterReference.get().killBroker(i, Duration.ofSeconds(5));

Review Comment:
   I think this is unnecessary. Removing



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-13907: Fix hanging ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable [kafka]

2024-04-04 Thread via GitHub


soarez commented on code in PR #12174:
URL: https://github.com/apache/kafka/pull/12174#discussion_r1551132385


##
core/src/test/scala/unit/kafka/server/ServerShutdownTest.scala:
##
@@ -197,12 +198,11 @@ class ServerShutdownTest extends KafkaServerTestHarness {
 verifyNonDaemonThreadsStatus()
   }
 
-  @Disabled
   @ParameterizedTest(name = TestInfoUtils.TestWithParameterizedQuorumName)
   @ValueSource(strings = Array("kraft"))
   def testCleanShutdownWithKRaftControllerUnavailable(quorum: String): Unit = {

Review Comment:
   That's a good point. If the controller is unavailable, the shutdown can't 
really be considered clean. However, I think it's better to just remove the 
`Clean` word from the method, rather than replacing it with `Dirty`. It seems 
cleaner and less confusing, as it might not be clear what "dirty" is referring 
to. Please let me know if you disagree.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-13907: Fix hanging ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable [kafka]

2024-04-04 Thread via GitHub


soarez commented on code in PR #12174:
URL: https://github.com/apache/kafka/pull/12174#discussion_r1551129021


##
core/src/test/scala/unit/kafka/integration/KafkaServerTestHarness.scala:
##
@@ -260,9 +274,12 @@ abstract class KafkaServerTestHarness extends 
QuorumTestHarness {
   }
 
   def killBroker(index: Int): Unit = {

Review Comment:
   The original implementation has a timeout of 5 minutes, so I'll keep the 
same behavior here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-13907: Fix hanging ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable [kafka]

2024-04-03 Thread via GitHub


chia7712 commented on code in PR #12174:
URL: https://github.com/apache/kafka/pull/12174#discussion_r1549916567


##
core/src/test/scala/unit/kafka/integration/KafkaServerTestHarness.scala:
##
@@ -260,9 +274,12 @@ abstract class KafkaServerTestHarness extends 
QuorumTestHarness {
   }
 
   def killBroker(index: Int): Unit = {

Review Comment:
   maybe we should keep origin implementation since it expect to await shutdown.



##
core/src/test/java/kafka/test/junit/ZkClusterInvocationContext.java:
##
@@ -323,7 +324,7 @@ public void rollingBrokerRestart() {
 throw new IllegalStateException("Tried to restart brokers but 
the cluster has not been started!");
 }
 for (int i = 0; i < clusterReference.get().brokerCount(); i++) {
-clusterReference.get().killBroker(i);
+clusterReference.get().killBroker(i, Duration.ofSeconds(5));

Review Comment:
   what is the purpose of this change?



##
core/src/test/scala/unit/kafka/server/ServerShutdownTest.scala:
##
@@ -197,12 +198,11 @@ class ServerShutdownTest extends KafkaServerTestHarness {
 verifyNonDaemonThreadsStatus()
   }
 
-  @Disabled
   @ParameterizedTest(name = TestInfoUtils.TestWithParameterizedQuorumName)
   @ValueSource(strings = Array("kraft"))
   def testCleanShutdownWithKRaftControllerUnavailable(quorum: String): Unit = {

Review Comment:
   the shutdown with timeout is a kind of dirty shutdown so we should rename 
the test to `testDirtyShutdownWithKRaftControllerUnavailable`



##
core/src/test/scala/unit/kafka/integration/KafkaServerTestHarness.scala:
##
@@ -260,9 +274,12 @@ abstract class KafkaServerTestHarness extends 
QuorumTestHarness {
   }
 
   def killBroker(index: Int): Unit = {
-if (alive(index)) {
-  _brokers(index).shutdown()
-  _brokers(index).awaitShutdown()
+killBroker(index, Duration.ofSeconds(5))
+  }
+
+  def killBroker(index: Int, timeout: Duration): Unit = {

Review Comment:
   we need to document the difference of this variety.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-13907: Fix hanging ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable [kafka]

2024-04-03 Thread via GitHub


soarez commented on code in PR #12174:
URL: https://github.com/apache/kafka/pull/12174#discussion_r1549909706


##
core/src/main/scala/kafka/server/BrokerServer.scala:
##
@@ -623,9 +623,12 @@ class BrokerServer(
 }
   }
 
-  override def shutdown(): Unit = {
+  override def shutdown(): Unit = shutdown(TimeUnit.MINUTES.toMillis(5))

Review Comment:
   Yup, that makes sense. Changed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-13907: Fix hanging ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable [kafka]

2024-04-03 Thread via GitHub


soarez commented on code in PR #12174:
URL: https://github.com/apache/kafka/pull/12174#discussion_r1549909351


##
core/src/main/scala/kafka/server/KafkaBroker.scala:
##
@@ -93,6 +93,7 @@ trait KafkaBroker extends Logging {
   def startup(): Unit
   def awaitShutdown(): Unit
   def shutdown(): Unit
+  def shutdown(timeoutMs: Long): Unit

Review Comment:
   Good suggestion. Applied



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-13907: Fix hanging ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable [kafka]

2024-04-03 Thread via GitHub


chia7712 commented on code in PR #12174:
URL: https://github.com/apache/kafka/pull/12174#discussion_r1549747480


##
core/src/main/scala/kafka/server/KafkaBroker.scala:
##
@@ -93,6 +93,7 @@ trait KafkaBroker extends Logging {
   def startup(): Unit
   def awaitShutdown(): Unit
   def shutdown(): Unit
+  def shutdown(timeoutMs: Long): Unit

Review Comment:
   How about using `Duration` instead of long type? Also, we can rename it from 
`timeoutMs` to `timeout`



##
core/src/main/scala/kafka/server/BrokerServer.scala:
##
@@ -623,9 +623,12 @@ class BrokerServer(
 }
   }
 
-  override def shutdown(): Unit = {
+  override def shutdown(): Unit = shutdown(TimeUnit.MINUTES.toMillis(5))

Review Comment:
   How about adding default implementation to parent class?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[PR] KAFKA-13907: Fix hanging ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable [kafka]

2024-04-03 Thread via GitHub


soarez opened a new pull request, #12174:
URL: https://github.com/apache/kafka/pull/12174

   When a controlled shutdown is requested the broker tries to communicate the 
state change to the controller via a heartbeat request. [1]
   
   In this test, the controller is not available so the request will fail. The 
current timeout behavior in a heartbeat request is to just keep retrying — 
which generally makes sense, just not in the context of a controlled shutdown.
   
   When a heartbeat request times out, if we are in the middle of a controlled 
shutdown, we shouldn't just retry forever but rather just give up on trying to 
contact the controller and proceed with the controlled shutdown.
   
   [1] 
https://github.com/apache/kafka/blob/f2d6282668a31b9a554563338f9178e2bba2833f/core/src/main/scala/kafka/server/BrokerLifecycleManager.scala#L217
   
   *Summary of testing strategy*
   The test no longer fails
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [PR] KAFKA-13907: Fix hanging ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable [kafka]

2023-11-06 Thread via GitHub


soarez closed pull request #12174: KAFKA-13907: Fix hanging 
ServerShutdownTest.testCleanShutdownWithKRaftControllerUnavailable
URL: https://github.com/apache/kafka/pull/12174


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org