[GitHub] [kafka] jolshan commented on a diff in pull request #13963: KAFKA-14462; [22/N] Implement session and revocation timeouts

2023-07-11 Thread via GitHub


jolshan commented on code in PR #13963:
URL: https://github.com/apache/kafka/pull/13963#discussion_r1260232815


##
group-coordinator/src/test/java/org/apache/kafka/coordinator/group/GroupMetadataManagerTest.java:
##
@@ -2798,8 +2798,6 @@ public void testRevocationTimeoutLifecycle() {
 .setGroupId(groupId)
 .setMemberId(memberId1)
 .setMemberEpoch(1)
-.setRebalanceTimeoutMs(9)

Review Comment:
   Why did we remove this?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [kafka] jolshan commented on a diff in pull request #13963: KAFKA-14462; [22/N] Implement session and revocation timeouts

2023-07-11 Thread via GitHub


jolshan commented on code in PR #13963:
URL: https://github.com/apache/kafka/pull/13963#discussion_r1260035984


##
group-coordinator/src/test/java/org/apache/kafka/coordinator/group/GroupMetadataManagerTest.java:
##
@@ -2402,6 +2448,584 @@ public void testOnNewMetadataImage() {
 assertEquals(image, context.groupMetadataManager.image());
 }
 
+@Test
+public void testSessionTimeoutLifecycle() {
+String groupId = "fooup";
+// Use a static member id as it makes the test easier.
+String memberId = Uuid.randomUuid().toString();
+
+Uuid fooTopicId = Uuid.randomUuid();
+String fooTopicName = "foo";
+
+MockPartitionAssignor assignor = new MockPartitionAssignor("range");
+GroupMetadataManagerTestContext context = new 
GroupMetadataManagerTestContext.Builder()
+.withAssignors(Collections.singletonList(assignor))
+.withMetadataImage(new MetadataImageBuilder()
+.addTopic(fooTopicId, fooTopicName, 6)
+.build())
+.build();
+
+assignor.prepareGroupAssignment(new GroupAssignment(
+Collections.singletonMap(memberId, new 
MemberAssignment(mkAssignment(
+mkTopicAssignment(fooTopicId, 0, 1, 2, 3, 4, 5)
+)))
+));
+
+// Session timer is scheduled on first heartbeat.
+CoordinatorResult result =
+context.consumerGroupHeartbeat(
+new ConsumerGroupHeartbeatRequestData()
+.setGroupId(groupId)
+.setMemberId(memberId)
+.setMemberEpoch(0)
+.setRebalanceTimeoutMs(9)
+.setSubscribedTopicNames(Collections.singletonList("foo"))
+.setTopicPartitions(Collections.emptyList()));
+assertEquals(1, result.response().memberEpoch());
+
+// Verify that there is a session time.
+context.assertSessionTimeout(groupId, memberId, 45000);
+
+// Advance time.
+assertEquals(
+Collections.emptyList(),
+context.sleep(result.response().heartbeatIntervalMs())
+);
+
+// Session timer is rescheduled on second heartbeat.
+result = context.consumerGroupHeartbeat(
+new ConsumerGroupHeartbeatRequestData()
+.setGroupId(groupId)
+.setMemberId(memberId)
+.setMemberEpoch(result.response().memberEpoch()));
+assertEquals(1, result.response().memberEpoch());
+
+// Verify that there is a session time.
+context.assertSessionTimeout(groupId, memberId, 45000);
+
+// Advance time.
+assertEquals(
+Collections.emptyList(),
+context.sleep(result.response().heartbeatIntervalMs())
+);
+
+// Session timer is cancelled on leave.
+result = context.consumerGroupHeartbeat(
+new ConsumerGroupHeartbeatRequestData()
+.setGroupId(groupId)
+.setMemberId(memberId)
+.setMemberEpoch(-1));
+assertEquals(-1, result.response().memberEpoch());
+
+// Verify that there are no timers.
+context.assertNoSessionTimeout(groupId, memberId);
+context.assertNoRevocationTimeout(groupId, memberId);
+}
+
+@Test
+public void testSessionTimeoutExpiration() {
+String groupId = "fooup";
+// Use a static member id as it makes the test easier.
+String memberId = Uuid.randomUuid().toString();
+
+Uuid fooTopicId = Uuid.randomUuid();
+String fooTopicName = "foo";
+
+MockPartitionAssignor assignor = new MockPartitionAssignor("range");
+GroupMetadataManagerTestContext context = new 
GroupMetadataManagerTestContext.Builder()
+.withAssignors(Collections.singletonList(assignor))
+.withMetadataImage(new MetadataImageBuilder()
+.addTopic(fooTopicId, fooTopicName, 6)
+.build())
+.build();
+
+assignor.prepareGroupAssignment(new GroupAssignment(
+Collections.singletonMap(memberId, new 
MemberAssignment(mkAssignment(
+mkTopicAssignment(fooTopicId, 0, 1, 2, 3, 4, 5)
+)))
+));
+
+// Session timer is scheduled on first heartbeat.
+CoordinatorResult result =
+context.consumerGroupHeartbeat(
+new ConsumerGroupHeartbeatRequestData()
+.setGroupId(groupId)
+.setMemberId(memberId)
+.setMemberEpoch(0)
+.setRebalanceTimeoutMs(9)
+.setSubscribedTopicNames(Collections.singletonList("foo"))
+.setTopicPartitions(Collections.emptyList()));
+assertEquals(1, result.response().memberEpoch());
+
+// Verify that there is a session time.
+context.assertSessionTimeout(groupId, 

[GitHub] [kafka] jolshan commented on a diff in pull request #13963: KAFKA-14462; [22/N] Implement session and revocation timeouts

2023-07-11 Thread via GitHub


jolshan commented on code in PR #13963:
URL: https://github.com/apache/kafka/pull/13963#discussion_r1259983786


##
group-coordinator/src/test/java/org/apache/kafka/coordinator/group/GroupMetadataManagerTest.java:
##
@@ -2402,6 +2448,584 @@ public void testOnNewMetadataImage() {
 assertEquals(image, context.groupMetadataManager.image());
 }
 
+@Test
+public void testSessionTimeoutLifecycle() {
+String groupId = "fooup";
+// Use a static member id as it makes the test easier.
+String memberId = Uuid.randomUuid().toString();
+
+Uuid fooTopicId = Uuid.randomUuid();
+String fooTopicName = "foo";
+
+MockPartitionAssignor assignor = new MockPartitionAssignor("range");
+GroupMetadataManagerTestContext context = new 
GroupMetadataManagerTestContext.Builder()
+.withAssignors(Collections.singletonList(assignor))
+.withMetadataImage(new MetadataImageBuilder()
+.addTopic(fooTopicId, fooTopicName, 6)
+.build())
+.build();
+
+assignor.prepareGroupAssignment(new GroupAssignment(
+Collections.singletonMap(memberId, new 
MemberAssignment(mkAssignment(
+mkTopicAssignment(fooTopicId, 0, 1, 2, 3, 4, 5)
+)))
+));
+
+// Session timer is scheduled on first heartbeat.
+CoordinatorResult result =
+context.consumerGroupHeartbeat(
+new ConsumerGroupHeartbeatRequestData()
+.setGroupId(groupId)
+.setMemberId(memberId)
+.setMemberEpoch(0)
+.setRebalanceTimeoutMs(9)
+.setSubscribedTopicNames(Collections.singletonList("foo"))
+.setTopicPartitions(Collections.emptyList()));
+assertEquals(1, result.response().memberEpoch());
+
+// Verify that there is a session time.
+context.assertSessionTimeout(groupId, memberId, 45000);
+
+// Advance time.
+assertEquals(
+Collections.emptyList(),
+context.sleep(result.response().heartbeatIntervalMs())
+);
+
+// Session timer is rescheduled on second heartbeat.
+result = context.consumerGroupHeartbeat(
+new ConsumerGroupHeartbeatRequestData()
+.setGroupId(groupId)
+.setMemberId(memberId)
+.setMemberEpoch(result.response().memberEpoch()));
+assertEquals(1, result.response().memberEpoch());
+
+// Verify that there is a session time.
+context.assertSessionTimeout(groupId, memberId, 45000);
+
+// Advance time.
+assertEquals(
+Collections.emptyList(),
+context.sleep(result.response().heartbeatIntervalMs())
+);
+
+// Session timer is cancelled on leave.
+result = context.consumerGroupHeartbeat(
+new ConsumerGroupHeartbeatRequestData()
+.setGroupId(groupId)
+.setMemberId(memberId)
+.setMemberEpoch(-1));
+assertEquals(-1, result.response().memberEpoch());
+
+// Verify that there are no timers.
+context.assertNoSessionTimeout(groupId, memberId);
+context.assertNoRevocationTimeout(groupId, memberId);
+}
+
+@Test
+public void testSessionTimeoutExpiration() {
+String groupId = "fooup";
+// Use a static member id as it makes the test easier.
+String memberId = Uuid.randomUuid().toString();
+
+Uuid fooTopicId = Uuid.randomUuid();
+String fooTopicName = "foo";
+
+MockPartitionAssignor assignor = new MockPartitionAssignor("range");
+GroupMetadataManagerTestContext context = new 
GroupMetadataManagerTestContext.Builder()
+.withAssignors(Collections.singletonList(assignor))
+.withMetadataImage(new MetadataImageBuilder()
+.addTopic(fooTopicId, fooTopicName, 6)
+.build())
+.build();
+
+assignor.prepareGroupAssignment(new GroupAssignment(
+Collections.singletonMap(memberId, new 
MemberAssignment(mkAssignment(
+mkTopicAssignment(fooTopicId, 0, 1, 2, 3, 4, 5)
+)))
+));
+
+// Session timer is scheduled on first heartbeat.
+CoordinatorResult result =
+context.consumerGroupHeartbeat(
+new ConsumerGroupHeartbeatRequestData()
+.setGroupId(groupId)
+.setMemberId(memberId)
+.setMemberEpoch(0)
+.setRebalanceTimeoutMs(9)
+.setSubscribedTopicNames(Collections.singletonList("foo"))
+.setTopicPartitions(Collections.emptyList()));
+assertEquals(1, result.response().memberEpoch());
+
+// Verify that there is a session time.
+context.assertSessionTimeout(groupId, 

[GitHub] [kafka] jolshan commented on a diff in pull request #13963: KAFKA-14462; [22/N] Implement session and revocation timeouts

2023-07-10 Thread via GitHub


jolshan commented on code in PR #13963:
URL: https://github.com/apache/kafka/pull/13963#discussion_r1258697658


##
group-coordinator/src/test/java/org/apache/kafka/coordinator/group/MockCoordinatorTimer.java:
##
@@ -0,0 +1,144 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.coordinator.group;
+
+import org.apache.kafka.common.utils.Time;
+import org.apache.kafka.coordinator.group.runtime.CoordinatorTimer;
+
+import java.util.ArrayList;
+import java.util.Comparator;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+import java.util.PriorityQueue;
+import java.util.concurrent.TimeUnit;
+
+public class MockCoordinatorTimer implements CoordinatorTimer {
+public static class ScheduledTimeout {
+public final String key;
+public final long deadlineMs;
+public final TimeoutOperation operation;
+
+ScheduledTimeout(
+String key,
+long deadlineMs,
+TimeoutOperation operation
+) {
+this.key = key;
+this.deadlineMs = deadlineMs;
+this.operation = operation;
+}
+}
+
+public static class ExpiredTimeout {
+public final String key;
+public final List records;
+
+ExpiredTimeout(
+String key,
+List records
+) {
+this.key = key;
+this.records = records;
+}
+
+@Override
+public boolean equals(Object o) {
+if (this == o) return true;
+if (o == null || getClass() != o.getClass()) return false;
+
+ExpiredTimeout that = (ExpiredTimeout) o;
+
+if (!Objects.equals(key, that.key)) return false;
+return Objects.equals(records, that.records);
+}
+
+@Override
+public int hashCode() {
+int result = key != null ? key.hashCode() : 0;
+result = 31 * result + (records != null ? records.hashCode() : 0);
+return result;
+}
+}
+
+private final Time time;
+
+private final Map> timeoutMap = new 
HashMap<>();
+private final PriorityQueue> timeoutQueue = new 
PriorityQueue<>(
+Comparator.comparingLong(entry -> entry.deadlineMs)
+);
+
+public MockCoordinatorTimer(Time time) {
+this.time = time;
+}
+
+@Override
+public void schedule(
+String key,
+long delay,
+TimeUnit unit,
+boolean retry,
+TimeoutOperation operation
+) {
+cancel(key);
+
+long deadlineMs = time.milliseconds() + unit.toMillis(delay);
+ScheduledTimeout timeout = new ScheduledTimeout<>(key, deadlineMs, 
operation);
+timeoutQueue.add(timeout);
+timeoutMap.put(key, timeout);
+}
+
+@Override
+public void cancel(String key) {
+ScheduledTimeout timeout = timeoutMap.remove(key);
+if (timeout != null) {
+timeoutQueue.remove(timeout);
+}
+}
+
+public boolean contains(String key) {
+return timeoutMap.containsKey(key);
+}
+
+public ScheduledTimeout timeout(String key) {
+return timeoutMap.get(key);
+}
+
+public int size() {
+return timeoutMap.size();
+}
+
+public List> poll() {

Review Comment:
   So we don't expire timeouts until we call this poll? Also, in general, can we
add a few more comments about this class and its usage?
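
To make the question concrete: in a mock timer of this shape nothing fires on its own. A test advances a mock clock and then calls poll() to drain the timeouts whose deadlines have passed. The quoted diff is cut off before the body of poll(), so the following is only a minimal sketch of that pattern and not the PR's code. `PollDrivenTimerSketch`, the `Runnable` operations, and the explicit `nowMs` argument are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical stand-in for a poll-driven mock timer; not the PR's class.
class PollDrivenTimerSketch {
    static final class Timeout {
        final String key;
        final long deadlineMs;
        final Runnable operation;

        Timeout(String key, long deadlineMs, Runnable operation) {
            this.key = key;
            this.deadlineMs = deadlineMs;
            this.operation = operation;
        }
    }

    private final Map<String, Timeout> timeoutMap = new HashMap<>();
    private final PriorityQueue<Timeout> timeoutQueue =
        new PriorityQueue<>(Comparator.comparingLong((Timeout t) -> t.deadlineMs));

    // Scheduling under an existing key replaces the previous timeout.
    void schedule(String key, long nowMs, long delayMs, Runnable operation) {
        cancel(key);
        Timeout timeout = new Timeout(key, nowMs + delayMs, operation);
        timeoutQueue.add(timeout);
        timeoutMap.put(key, timeout);
    }

    void cancel(String key) {
        Timeout timeout = timeoutMap.remove(key);
        if (timeout != null) {
            timeoutQueue.remove(timeout);
        }
    }

    boolean contains(String key) {
        return timeoutMap.containsKey(key);
    }

    // Runs and removes every timeout whose deadline is at or before nowMs,
    // in deadline order, and returns the keys that expired.
    List<String> poll(long nowMs) {
        List<String> expired = new ArrayList<>();
        Timeout next = timeoutQueue.peek();
        while (next != null && next.deadlineMs <= nowMs) {
            timeoutQueue.poll();
            timeoutMap.remove(next.key);
            next.operation.run();
            expired.add(next.key);
            next = timeoutQueue.peek();
        }
        return expired;
    }
}
```

Under this design expiry is fully deterministic: a timeout that is past its deadline stays scheduled until the test calls poll(), which is the behaviour the comment asks about.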



[GitHub] [kafka] jolshan commented on a diff in pull request #13963: KAFKA-14462; [22/N] Implement session and revocation timeouts

2023-07-10 Thread via GitHub


jolshan commented on code in PR #13963:
URL: https://github.com/apache/kafka/pull/13963#discussion_r1258683353


##
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/GroupMetadataManager.java:
##
@@ -720,19 +765,116 @@ private 
CoordinatorResult consumerGr
 );
 
 if (!subscriptionMetadata.equals(group.subscriptionMetadata())) {
-log.info("[GroupId " + groupId + "] Computed new subscription 
metadata: "
+log.info("[GroupId " + group.groupId() + "] Computed new 
subscription metadata: "
 + subscriptionMetadata + ".");
-records.add(newGroupSubscriptionMetadataRecord(groupId, 
subscriptionMetadata));
+records.add(newGroupSubscriptionMetadataRecord(group.groupId(), 
subscriptionMetadata));
 }
 
 // We bump the group epoch.
 int groupEpoch = group.groupEpoch() + 1;
-records.add(newGroupEpochRecord(groupId, groupEpoch));
+records.add(newGroupEpochRecord(group.groupId(), groupEpoch));
 
-return new CoordinatorResult<>(records, new 
ConsumerGroupHeartbeatResponseData()
-.setMemberId(memberId)
-.setMemberEpoch(-1)
-);
+return records;
+}
+
+/**
+ * Schedules (or reschedules) the session timeout for the member.
+ *
+ * @param groupId   The group id.
+ * @param memberId  The member id.
+ */
+private void scheduleConsumerGroupSessionTimeout(
+String groupId,
+String memberId
+) {
+String key = consumerGroupSessionTimeoutKey(groupId, memberId);
+timer.schedule(key, consumerGroupSessionTimeoutMs, 
TimeUnit.MILLISECONDS, true, () -> {
+try {
+ConsumerGroup group = getOrMaybeCreateConsumerGroup(groupId, 
false);
+ConsumerGroupMember member = 
group.getOrMaybeCreateMember(memberId, false);
+
+log.info("[GroupId " + groupId + "] Member " + memberId + " 
fenced from the group because " +
+"its session expired.");
+
+return consumerGroupFenceMember(group, member);
+} catch (GroupIdNotFoundException ex) {
+log.debug("[GroupId " + groupId + "] Could not fence " + 
memberId + " because the group " +
+"does not exist.");
+} catch (UnknownMemberIdException ex) {
+log.debug("[GroupId " + groupId + "] Could not fence " + 
memberId + " because the member " +
+"does not exist.");
+}
+
+return Collections.emptyList();
+});
+}
+
+/**
+ * Cancels the session timeout of the member.
+ *
+ * @param groupId   The group id.
+ * @param memberId  The member id.
+ */
+private void cancelConsumerGroupSessionTimeout(
+String groupId,
+String memberId
+) {
+timer.cancel(consumerGroupSessionTimeoutKey(groupId, memberId));
+}
+
+/**
+ * Schedules a revocation timeout for the member.
+ *
+ * @param groupId   The group id.
+ * @param memberId  The member id.
+ * @param revocationTimeoutMs   The revocation timeout.
+ * @param expectedMemberEpoch   The expected member epoch.
+ */
+private void scheduleConsumerGroupRevocationTimeout(
+String groupId,
+String memberId,
+long revocationTimeoutMs,
+int expectedMemberEpoch
+) {
+String key = consumerGroupRevocationTimeoutKey(groupId, memberId);
+timer.schedule(key, revocationTimeoutMs, TimeUnit.MILLISECONDS, true, 
() -> {
+try {
+ConsumerGroup group = getOrMaybeCreateConsumerGroup(groupId, 
false);
+ConsumerGroupMember member = 
group.getOrMaybeCreateMember(memberId, false);
+
+if (member.state() != ConsumerGroupMember.MemberState.REVOKING &&
+member.memberEpoch() != expectedMemberEpoch) {

Review Comment:
   Do we have a test for this case?



[GitHub] [kafka] jolshan commented on a diff in pull request #13963: KAFKA-14462; [22/N] Implement session and revocation timeouts

2023-07-10 Thread via GitHub


jolshan commented on code in PR #13963:
URL: https://github.com/apache/kafka/pull/13963#discussion_r1258681834


##
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/GroupMetadataManager.java:
##
@@ -720,19 +765,116 @@ private 
CoordinatorResult consumerGr
 );
 
 if (!subscriptionMetadata.equals(group.subscriptionMetadata())) {
-log.info("[GroupId " + groupId + "] Computed new subscription 
metadata: "
+log.info("[GroupId " + group.groupId() + "] Computed new 
subscription metadata: "
 + subscriptionMetadata + ".");
-records.add(newGroupSubscriptionMetadataRecord(groupId, 
subscriptionMetadata));
+records.add(newGroupSubscriptionMetadataRecord(group.groupId(), 
subscriptionMetadata));
 }
 
 // We bump the group epoch.
 int groupEpoch = group.groupEpoch() + 1;
-records.add(newGroupEpochRecord(groupId, groupEpoch));
+records.add(newGroupEpochRecord(group.groupId(), groupEpoch));
 
-return new CoordinatorResult<>(records, new 
ConsumerGroupHeartbeatResponseData()
-.setMemberId(memberId)
-.setMemberEpoch(-1)
-);
+return records;
+}
+
+/**
+ * Schedules (or reschedules) the session timeout for the member.
+ *
+ * @param groupId   The group id.
+ * @param memberId  The member id.
+ */
+private void scheduleConsumerGroupSessionTimeout(
+String groupId,
+String memberId
+) {
+String key = consumerGroupSessionTimeoutKey(groupId, memberId);
+timer.schedule(key, consumerGroupSessionTimeoutMs, 
TimeUnit.MILLISECONDS, true, () -> {
+try {
+ConsumerGroup group = getOrMaybeCreateConsumerGroup(groupId, 
false);
+ConsumerGroupMember member = 
group.getOrMaybeCreateMember(memberId, false);
+
+log.info("[GroupId " + groupId + "] Member " + memberId + " 
fenced from the group because " +
+"its session expired.");
+
+return consumerGroupFenceMember(group, member);
+} catch (GroupIdNotFoundException ex) {

Review Comment:
   Oh I see. We only catch the exceptions in the other PR if they are thrown 
(not caught) here.
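
The pattern described here can be sketched as follows: the timeout operation looks the member up itself, treats a missing member as a no-op, and returns an empty record list, so nothing propagates to whatever runs the operation. This is a minimal sketch under stated assumptions. `TimeoutOperationSketch`, `UnknownMemberException`, the `Supplier<List<String>>` operation type, and the string records are hypothetical and stand in for the PR's types.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch; names and types are not from the PR.
class TimeoutOperationSketch {
    static class UnknownMemberException extends RuntimeException {
        UnknownMemberException(String memberId) {
            super("Unknown member " + memberId);
        }
    }

    private final Map<String, Integer> memberEpochs = new HashMap<>();

    void addMember(String memberId, int epoch) {
        memberEpochs.put(memberId, epoch);
    }

    // Throws if the member is not known, like a lookup with createIfNotExists=false.
    private int requireMember(String memberId) {
        Integer epoch = memberEpochs.get(memberId);
        if (epoch == null) {
            throw new UnknownMemberException(memberId);
        }
        return epoch;
    }

    // Builds the operation to run on expiry. A missing member is handled
    // inside the operation, which then returns no records.
    Supplier<List<String>> fenceOperation(String memberId) {
        return () -> {
            try {
                int epoch = requireMember(memberId);
                memberEpochs.remove(memberId);
                return Collections.singletonList("fence " + memberId + " at epoch " + epoch);
            } catch (UnknownMemberException ex) {
                return Collections.emptyList();
            }
        };
    }
}
```

Only exceptions that the operation does not catch would reach the caller, which matches the reading in the comment.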



[GitHub] [kafka] jolshan commented on a diff in pull request #13963: KAFKA-14462; [22/N] Implement session and revocation timeouts

2023-07-10 Thread via GitHub


jolshan commented on code in PR #13963:
URL: https://github.com/apache/kafka/pull/13963#discussion_r1258653960


##
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/GroupMetadataManager.java:
##
@@ -720,19 +765,116 @@ private 
CoordinatorResult consumerGr
 );
 
 if (!subscriptionMetadata.equals(group.subscriptionMetadata())) {
-log.info("[GroupId " + groupId + "] Computed new subscription 
metadata: "
+log.info("[GroupId " + group.groupId() + "] Computed new 
subscription metadata: "
 + subscriptionMetadata + ".");
-records.add(newGroupSubscriptionMetadataRecord(groupId, 
subscriptionMetadata));
+records.add(newGroupSubscriptionMetadataRecord(group.groupId(), 
subscriptionMetadata));
 }
 
 // We bump the group epoch.
 int groupEpoch = group.groupEpoch() + 1;
-records.add(newGroupEpochRecord(groupId, groupEpoch));
+records.add(newGroupEpochRecord(group.groupId(), groupEpoch));
 
-return new CoordinatorResult<>(records, new 
ConsumerGroupHeartbeatResponseData()
-.setMemberId(memberId)
-.setMemberEpoch(-1)
-);
+return records;
+}
+
+/**
+ * Schedules (or reschedules) the session timeout for the member.
+ *
+ * @param groupId   The group id.
+ * @param memberId  The member id.
+ */
+private void scheduleConsumerGroupSessionTimeout(
+String groupId,
+String memberId
+) {
+String key = consumerGroupSessionTimeoutKey(groupId, memberId);
+timer.schedule(key, consumerGroupSessionTimeoutMs, 
TimeUnit.MILLISECONDS, true, () -> {
+try {
+ConsumerGroup group = getOrMaybeCreateConsumerGroup(groupId, 
false);
+ConsumerGroupMember member = 
group.getOrMaybeCreateMember(memberId, false);
+
+log.info("[GroupId " + groupId + "] Member " + memberId + " 
fenced from the group because " +
+"its session expired.");
+
+return consumerGroupFenceMember(group, member);
+} catch (GroupIdNotFoundException ex) {
+log.debug("[GroupId " + groupId + "] Could not fence " + 
memberId + " because the group " +
+"does not exist.");
+} catch (UnknownMemberIdException ex) {
+log.debug("[GroupId " + groupId + "] Could not fence " + 
memberId + " because the member " +
+"does not exist.");
+}
+
+return Collections.emptyList();
+});
+}
+
+/**
+ * Cancels the session timeout of the member.
+ *
+ * @param groupId   The group id.
+ * @param memberId  The member id.
+ */
+private void cancelConsumerGroupSessionTimeout(
+String groupId,
+String memberId
+) {
+timer.cancel(consumerGroupSessionTimeoutKey(groupId, memberId));
+}
+
+/**
+ * Schedules a revocation timeout for the member.
+ *
+ * @param groupId   The group id.
+ * @param memberId  The member id.
+ * @param revocationTimeoutMs   The revocation timeout.
+ * @param expectedMemberEpoch   The expected member epoch.
+ */
+private void scheduleConsumerGroupRevocationTimeout(
+String groupId,
+String memberId,
+long revocationTimeoutMs,
+int expectedMemberEpoch
+) {
+String key = consumerGroupRevocationTimeoutKey(groupId, memberId);
+timer.schedule(key, revocationTimeoutMs, TimeUnit.MILLISECONDS, true, 
() -> {
+try {
+ConsumerGroup group = getOrMaybeCreateConsumerGroup(groupId, 
false);
+ConsumerGroupMember member = 
group.getOrMaybeCreateMember(memberId, false);
+
+if (member.state() != ConsumerGroupMember.MemberState.REVOKING &&
+member.memberEpoch() != expectedMemberEpoch) {

Review Comment:
   We continue to revoke even if the epoch is unexpected? Also do we want to 
return early here?
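
The concern can be checked by enumerating the condition. A minimal sketch, assuming the guard is meant to skip fencing once the member has moved on (it is no longer revoking, or its epoch has changed): with `&&` the guard triggers only when both checks fail, so a member still in the revoking state at an unexpected epoch falls through. `RevocationGuardSketch`, its `MemberState` enum, `skipWithAnd`, and `skipWithOr` are hypothetical names; only the shape of the condition comes from the diff.

```java
// Hypothetical names; only the shape of the condition comes from the diff.
class RevocationGuardSketch {
    enum MemberState { STABLE, REVOKING, ASSIGNING }

    // Condition as written in the quoted diff.
    static boolean skipWithAnd(MemberState state, int epoch, int expectedEpoch) {
        return state != MemberState.REVOKING && epoch != expectedEpoch;
    }

    // Alternative: skip as soon as either check fails.
    static boolean skipWithOr(MemberState state, int epoch, int expectedEpoch) {
        return state != MemberState.REVOKING || epoch != expectedEpoch;
    }

    public static void main(String[] args) {
        int expectedEpoch = 1;
        for (MemberState state : MemberState.values()) {
            for (int epoch : new int[] {1, 2}) {
                System.out.printf(
                    "state=%s epoch=%d skipWithAnd=%b skipWithOr=%b%n",
                    state, epoch,
                    skipWithAnd(state, epoch, expectedEpoch),
                    skipWithOr(state, epoch, expectedEpoch));
            }
        }
    }
}
```

The two variants differ in exactly the cases where one check passes and the other fails, for example a revoking member whose epoch no longer matches.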



[GitHub] [kafka] jolshan commented on a diff in pull request #13963: KAFKA-14462; [22/N] Implement session and revocation timeouts

2023-07-10 Thread via GitHub


jolshan commented on code in PR #13963:
URL: https://github.com/apache/kafka/pull/13963#discussion_r1258650248


##
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/GroupMetadataManager.java:
##
@@ -700,17 +726,36 @@ private 
CoordinatorResult consumerGr
 String groupId,
 String memberId
 ) throws ApiException {
-List records = new ArrayList<>();
-
 ConsumerGroup group = getOrMaybeCreateConsumerGroup(groupId, false);
 ConsumerGroupMember member = group.getOrMaybeCreateMember(memberId, false);
 
 log.info("[GroupId " + groupId + "] Member " + memberId + " left the consumer group.");
 
+List records = consumerGroupFenceMember(group, member);
+cancelConsumerGroupSessionTimeout(groupId, memberId);

Review Comment:
   Would we also want to cancel the revocation timeout here?
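
   A minimal sketch of what cancelling both timers on leave could look like.
Everything here is an assumption for illustration: `KeyedTimer` is an
in-memory stand-in for the coordinator timer, and the key formats are made up
rather than taken from `consumerGroupSessionTimeoutKey` or
`consumerGroupRevocationTimeoutKey`.

```java
import java.util.HashMap;
import java.util.Map;

class LeaveCancelsTimersSketch {
    // Hypothetical stand-in for the coordinator timer: key -> pending task.
    static final class KeyedTimer {
        private final Map<String, Runnable> pending = new HashMap<>();

        // Scheduling under an existing key replaces the previous task.
        void schedule(String key, Runnable task) {
            pending.put(key, task);
        }

        // Cancelling an unknown key is a no-op.
        void cancel(String key) {
            pending.remove(key);
        }

        boolean isScheduled(String key) {
            return pending.containsKey(key);
        }
    }

    // Hypothetical key formats; the real helpers may differ.
    static String sessionTimeoutKey(String groupId, String memberId) {
        return "session-timeout-" + groupId + "-" + memberId;
    }

    static String revocationTimeoutKey(String groupId, String memberId) {
        return "revocation-timeout-" + groupId + "-" + memberId;
    }

    // On leave, drop every timer that could still fire for this member.
    static void onMemberLeave(KeyedTimer timer, String groupId, String memberId) {
        timer.cancel(sessionTimeoutKey(groupId, memberId));
        timer.cancel(revocationTimeoutKey(groupId, memberId));
    }

    public static void main(String[] args) {
        KeyedTimer timer = new KeyedTimer();
        timer.schedule(sessionTimeoutKey("g", "m1"), () -> { });
        timer.schedule(revocationTimeoutKey("g", "m1"), () -> { });
        onMemberLeave(timer, "g", "m1");
        System.out.println(timer.isScheduled(revocationTimeoutKey("g", "m1"))); // false
    }
}
```

   If only the session timeout is cancelled, the revocation task for the
departed member stays in the timer and fires later against a member that no
longer exists.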






[GitHub] [kafka] jolshan commented on a diff in pull request #13963: KAFKA-14462; [22/N] Implement session and revocation timeouts

2023-07-10 Thread via GitHub


jolshan commented on code in PR #13963:
URL: https://github.com/apache/kafka/pull/13963#discussion_r1258644974


##
group-coordinator/src/main/java/org/apache/kafka/coordinator/group/GroupMetadataManager.java:
##
@@ -720,19 +765,116 @@ private CoordinatorResult consumerGr
 );
 
 if (!subscriptionMetadata.equals(group.subscriptionMetadata())) {
-log.info("[GroupId " + groupId + "] Computed new subscription metadata: "
+log.info("[GroupId " + group.groupId() + "] Computed new subscription metadata: "
 + subscriptionMetadata + ".");
-records.add(newGroupSubscriptionMetadataRecord(groupId, subscriptionMetadata));
+records.add(newGroupSubscriptionMetadataRecord(group.groupId(), subscriptionMetadata));
 }
 
 // We bump the group epoch.
 int groupEpoch = group.groupEpoch() + 1;
-records.add(newGroupEpochRecord(groupId, groupEpoch));
+records.add(newGroupEpochRecord(group.groupId(), groupEpoch));
 
-return new CoordinatorResult<>(records, new ConsumerGroupHeartbeatResponseData()
-.setMemberId(memberId)
-.setMemberEpoch(-1)
-);
+return records;
+}
+
+/**
+ * Schedules (or reschedules) the session timeout for the member.
+ *
+ * @param groupId   The group id.
+ * @param memberId  The member id.
+ */
+private void scheduleConsumerGroupSessionTimeout(
+String groupId,
+String memberId
+) {
+String key = consumerGroupSessionTimeoutKey(groupId, memberId);
+timer.schedule(key, consumerGroupSessionTimeoutMs, TimeUnit.MILLISECONDS, true, () -> {
+try {
+ConsumerGroup group = getOrMaybeCreateConsumerGroup(groupId, false);
+ConsumerGroupMember member = group.getOrMaybeCreateMember(memberId, false);
+
+log.info("[GroupId " + groupId + "] Member " + memberId + " fenced from the group because " +
+"its session expired.");
+
+return consumerGroupFenceMember(group, member);
+} catch (GroupIdNotFoundException ex) {

Review Comment:
   Based on the previous PR, we retry these exceptions. I can imagine that some
metadata was slow to update and that a retry could eventually succeed. But do we
have any path forward if this task was scheduled and the group or member is no
longer there? I guess this relies on canceling the task when we call group
leave for the member? Do we also have a method to remove groups?
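
   A minimal sketch of one possible path forward, assuming a missing group or
member is treated as terminal inside the callback so that the task completes
with no records instead of being retried. The exception classes, the
`groups` map, and the `"fence:..."` marker are hypothetical stand-ins, not
the coordinator's types.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SessionTimeoutCallbackSketch {
    // Hypothetical stand-ins for the coordinator's not-found exceptions.
    static final class GroupNotFound extends RuntimeException {
        GroupNotFound(String message) { super(message); }
    }

    static final class MemberNotFound extends RuntimeException {
        MemberNotFound(String message) { super(message); }
    }

    // groups maps a group id to the ids of its current members.
    static List<String> onSessionTimeout(
        Map<String, Set<String>> groups,
        String groupId,
        String memberId
    ) {
        try {
            Set<String> members = groups.get(groupId);
            if (members == null) {
                throw new GroupNotFound(groupId);
            }
            if (!members.contains(memberId)) {
                throw new MemberNotFound(memberId);
            }
            // Placeholder for the real fencing records.
            return Collections.singletonList("fence:" + groupId + ":" + memberId);
        } catch (GroupNotFound | MemberNotFound ex) {
            // Nothing left to fence: finish with no records rather than
            // letting the exception escape and trigger a retry.
            return Collections.emptyList();
        }
    }

    public static void main(String[] args) {
        Map<String, Set<String>> groups = Map.of("g", Set.of("m1"));
        System.out.println(onSessionTimeout(groups, "g", "m1"));    // [fence:g:m1]
        System.out.println(onSessionTimeout(groups, "g", "gone"));  // []
        System.out.println(onSessionTimeout(groups, "none", "m1")); // []
    }
}
```

   This only covers the callback side. It does not replace cancelling the
timer when the member leaves or the group is removed.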





