Copilot commented on code in PR #818: URL: https://github.com/apache/kafka-site/pull/818#discussion_r2851979942
########## content/en/42/streams/developer-guide/kafka-streams-group-sh.md: ########## @@ -0,0 +1,183 @@ +--- +title: Kafka Streams Groups Tool +type: docs +description: +weight: 15 +tags: ['kafka', 'docs'] +aliases: +keywords: +--- + +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--> + +Use `kafka-streams-groups.sh` to manage **Streams groups** for the Streams Rebalance Protocol (KIP‑1071): list and describe groups, inspect members and offsets/lag, reset or delete offsets for input topics, and delete groups (optionally including internal topics). + + +# Overview + +A **Streams group** is a broker‑coordinated group type for Kafka Streams that uses Streams‑specific RPCs and metadata, distinct from classic consumer groups. The CLI surfaces Streams‑specific states, assignments, and input‑topic offsets to simplify visibility and administration. + +**Use with care:** Mutating operations (offset resets/deletes, group deletion) affect how applications will reprocess data when restarted. Always preview with \--dry-run before executing and ensure application instances are stopped/inactive and the group is empty before executing the command. + +# What the Streams Groups tool does + + * **List Streams groups** across a cluster and display or filter by group state (Empty, Not Ready, Assigning, Reconciling, Stable, Dead). + * **Describe a Streams group** and show: + * Group state, group epoch, target assignment epoch (with `--state`, `--verbose` for additional details). + * Per‑member info such as epochs, current vs target assignments, and whether a member still uses the classic protocol (with `--members` and `--verbose`). + * Input‑topic offsets and lag (with `--offsets`), to understand how far behind processing is. + * **Reset input‑topic offsets** for a Streams group to control reprocessing boundaries using precise specifiers (earliest, latest, to‑offset, to‑datetime, by‑duration, shift‑by, from‑file). Requires `--dry-run` or `--execute` and inactive instances. + * **Delete offsets** for input topics to force re‑consumption on next start. + * **Delete a Streams group** to clean up broker‑side Streams metadata (offsets, topology, assignments). Optionally delete all, or a subset of, **internal topics** at the same time using `--internal-topics`. + + + +# Usage + +The script is located in `bin/kafka-streams-groups.sh` and connects to your cluster via `--bootstrap-server`. For secured clusters, pass AdminClient properties using `--command-config`. + + + $ kafka-streams-groups.sh --bootstrap-server <host:port> [COMMAND] [OPTIONS] + +**Note:** `kafka-streams-groups.sh` complements the Streams Admin API for Streams groups. The CLI exposes list/describe/delete operations and offset management similar in spirit to consumer-group tools, but tailored to Streams groups defined in KIP‑1071. + +# Commands + +## List Streams groups + +Discovering groups + + + # List all Streams groups + kafka-streams-groups.sh --bootstrap-server localhost:9092 --list + + +## Describe Streams groups + +Inspecting group's state, members, and lag + + + # Describe a group: state + epochs + kafka-streams-groups.sh --bootstrap-server localhost:9092 \ + --describe --group my-streams-app --state --verbose + + # Describe a group: members (assignments vs target, classic/streams) + kafka-streams-groups.sh --bootstrap-server localhost:9092 \ + --describe --group my-streams-app --members --verbose + + # Describe a group: input-topic offsets and lag + kafka-streams-groups.sh --bootstrap-server localhost:9092 \ + --describe --group my-streams-app --offsets + + +## Reset input-topic offsets (preview, then apply) {#reset-offsets} + +Ensure all application instances are stopped/inactive. Always preview changes with `--dry-run` before using `--execute`. + + + # Preview resetting all input topics to a specific timestamp + kafka-streams-groups.sh --bootstrap-server localhost:9092 \ + --group my-streams-app \ + --reset-offsets --all-input-topics --to-datetime 2025-01-31T23:57:00.000 \ + --dry-run + + # Apply the reset + kafka-streams-groups.sh --bootstrap-server localhost:9092 \ + --group my-streams-app \ + --reset-offsets --all-input-topics --to-datetime 2025-01-31T23:57:00.000 \ + --execute + + +## Delete offsets to force re-consumption + +Delete offsets for all or specific input topics to have the group re-read data on restart. + + + # Delete offsets for all input topics (execute) + kafka-streams-groups.sh --bootstrap-server localhost:9092 \ + --group my-streams-app \ + --delete-offsets --all-input-topics --execute + + # Delete offsets for specific topics + kafka-streams-groups.sh --bootstrap-server localhost:9092 \ + --group my-streams-app \ + --delete-offsets --topic input-a --topic input-b --execute + + +## Delete a Streams group (cleanup) + +Delete broker-side Streams metadata for a group and optionally remove a subset of internal topics. + + + # Delete Streams group metadata + kafka-streams-groups.sh --bootstrap-server localhost:9092 \ + --delete --group my-streams-app + + # Delete a subset of internal topics alongside the group (use with care) + kafka-streams-groups.sh --bootstrap-server localhost:9092 \ + --delete --group my-streams-app \ + --internal-topics my-app-repartition-0,my-app-changelog + + +# All options and flags + +## Core actions + + * `--list`: List Streams groups. Use `--state` to display/filter by state. + * `--describe`: Describe a group selected by `--group`. Combine with: + * `--state` (group state and epochs), `--members` (members and assignments), `--offsets` (input and repartition topics offsets/lag). + * `--verbose` for additional details (e.g., leader epochs where applicable). + * `--reset-offsets`: Reset input-topic offsets (one group at a time; instances should be inactive). Choose exactly one specifier: + * `--to-earliest`, `--to-latest`, `--to-current`, `--to-offset <n>` + * `--by-duration <PnDTnHnMnS>`, `--to-datetime <YYYY-MM-DDTHH:mm:SS.sss>` Review Comment: The `--to-datetime` format string uses `...HH:mm:SS.sss` but other Streams docs use seconds as lowercase `ss` (e.g., `YYYY-MM-DDThh:mm:ss.sss`). Using `SS` here is likely misleading for users; align the format string with the actual expected pattern. ```suggestion * `--by-duration <PnDTnHnMnS>`, `--to-datetime <YYYY-MM-DDTHH:mm:ss.sss>` ``` ########## content/en/42/getting-started/upgrade.md: ########## @@ -110,7 +110,10 @@ For further details, please refer to [KIP-1120](https://cwiki.apache.org/conflue * **Admin** * The `listConsumerGroups()` and `listConsumerGroups(ListConsumerGroupsOptions)` methods in `Admin` are deprecated, and will be removed in the next major version. Use `Admin.listGroups(ListGroupsOptions.forConsumerGroups())` instead. * **Kafka Streams** - * The `window.size.ms` and `window.inner.serde.class` in `StreamsConfig` are deprecated. Use the corresponding string constants defined in `TimeWindowedSerializer`, `TimeWindowedDeserializer`, `SessionWindowedSerializer` and `SessionWindowedDeserializer` instead. + * Early Access for the Streams rebalance protocol. Following KIP-848, [KIP-1071](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1071%3A+Streams+Rebalance+Protocol) adds broker side task assignment for Kafka Streams applications. + This feature is in Early Access, disabled by default, and not for production use-cases. + We encourage users to try out the new "streams" group feature in lower environments using "throw away" broker cluster, Review Comment: Wording/consistency: “broker side” should be hyphenated as “broker-side”, and “throw away” is typically written as “throw-away” or “throwaway” (also consider adding an article, e.g., “a throw-away broker cluster”). These small fixes improve readability in the upgrade guide. ```suggestion * Early Access for the Streams rebalance protocol. Following KIP-848, [KIP-1071](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1071%3A+Streams+Rebalance+Protocol) adds broker-side task assignment for Kafka Streams applications. This feature is in Early Access, disabled by default, and not for production use-cases. We encourage users to try out the new "streams" group feature in lower environments using a "throw-away" broker cluster, ``` ########## content/en/42/getting-started/upgrade.md: ########## @@ -34,6 +34,7 @@ type: docs * The `--max-partition-memory-bytes` option in `kafka-console-producer` is deprecated and will be removed in Kafka 5.0. Please use `--batch-size` instead. * Queues for Kafka ([KIP-932](https://cwiki.apache.org/confluence/x/4hA0Dw)) is production-ready in Apache Kafka 4.2. This feature introduces a new kind of group called share groups, as an alternative to consumer groups. Consumers in a share group cooperatively consume records from topics, without assigning each partition to just one consumer. Share groups also introduce per-record acknowledgement and counting of delivery attempts. Use share groups in cases where records are processed one at a time, rather than as part of an ordered stream. + * The Streams Rebalance Protocol ([KIP-1071](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1071%3A+Streams+Rebalance+Protocol)) is now production-ready for its core feature set. This broker-driven rebalancing system designed specifically for Kafka Streams applications provides faster, more stable rebalances and better observability. For more information about the supported feature set, usage, and migration, please refer to the [Streams developer guide](/{version}/documentation/streams/developer-guide/streams-rebalance-protocol.html). Review Comment: Grammar: “This broker-driven rebalancing system designed …” is missing a verb. Consider rephrasing to “This broker-driven rebalancing system is designed …” to make the sentence read correctly. ```suggestion * The Streams Rebalance Protocol ([KIP-1071](https://cwiki.apache.org/confluence/display/KAFKA/KIP-1071%3A+Streams+Rebalance+Protocol)) is now production-ready for its core feature set. This broker-driven rebalancing system is designed specifically for Kafka Streams applications and provides faster, more stable rebalances and better observability. For more information about the supported feature set, usage, and migration, please refer to the [Streams developer guide](/{version}/documentation/streams/developer-guide/streams-rebalance-protocol.html). ``` ########## content/en/42/streams/developer-guide/app-reset-tool.md: ########## @@ -56,7 +56,7 @@ Prerequisites # Step 1: Run the application reset tool -If you are using **streams rebalance protocol** (available since AK 4.2), use the [Streams groups CLI](kafka-streams-group-sh.html#reset-offsets). +If you are using **streams rebalance protocol** (available since AK 4.2), use the [Streams groups CLI](kafka-streams-group-sh#reset-offsets). Review Comment: This link uses a different internal-URL style than the rest of this page (`manage-topics.html#...`, `security.html#...`, etc.). To keep navigation consistent (and avoid relying on implicit redirects), consider using the same `*.html#...` form here (and ensure the target section anchor matches). ```suggestion If you are using **streams rebalance protocol** (available since AK 4.2), use the [Streams groups CLI](kafka-streams-group-sh.html#reset-offsets). ``` ########## content/en/42/streams/developer-guide/streams-rebalance-protocol.md: ########## @@ -0,0 +1,255 @@ +--- +title: Streams Rebalance Protocol +description: +weight: 14 +tags: ['kafka', 'docs'] +aliases: +keywords: +type: docs +--- + +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--> + +The Streams Rebalance Protocol is a broker-driven rebalancing system designed specifically for Kafka Streams applications. Following the pattern of KIP-848, which moved rebalance coordination of plain consumers from clients to brokers, KIP-1071 extends this model to Kafka Streams workloads. + +# Overview + +Instead of clients computing new assignments on the client during rebalance events involving all members of the group, assignments are computed continuously on the broker. Instead of using a consumer group, the streams application registers as a **streams group** with the broker, which manages and exposes all metadata required for coordination of the streams application instances. + +This approach brings Kafka Streams coordination in line with the modern broker-driven rebalance model introduced for consumers in KIP-848, providing a dedicated group type with streams-specific semantics and metadata management. + +# What's Supported in This Version + +The following features are available in the current release: + +* **Core Streams Group Rebalance Protocol**: The `group.protocol=streams` configuration enables the dedicated streams rebalance protocol. This separates streams groups from consumer groups and provides a streams-specific group membership lifecycle and metadata management on the broker. + +* **Sticky Task Assignor**: A basic task assignment strategy that minimizes task movement during rebalances is included. + +* **Interactive Query Support**: IQ operations are compatible with the new streams protocol. + +* **New Admin RPC**: The StreamsGroupDescribe RPC provides streams-specific metadata separate from consumer group information, with corresponding access via the [`Admin`](/{version}/javadoc/org/apache/kafka/clients/admin/Admin.html) interface. + +* **CLI Integration**: You can list, describe, and delete streams groups via the [bin/kafka-streams-groups.sh](/{version}/streams/developer-guide/kafka-streams-group-sh/) script. + +* **Offline Migration**: After shutting down all members and waiting for their `session.timeout.ms` to expire (or forcing an explicit group leave), a classic group can be converted to a streams group and a streams group can be converted to a classic group. The only broker-side group data that will be preserved are the committed offsets. Internal topics (changelog and repartition topics) will continue to exist as regular Kafka topics. + +# What's Not Supported in This Version + +The following features are not yet available and should be avoided when using the new protocol: + +* **Static Membership**: Setting a client `instance.id` will be rejected. + +* **Topology Updates**: If a topology is changed significantly (e.g., by adding new source topics or changing the number of subtopologies), a new streams group must be created. + +* **High Availability Assignor**: Only the sticky assignor is supported. This implies that "warmup tasks" and rack aware assignment are not supported yet. + +* **Regular Expressions**: Pattern-based topic subscription is not supported. + +* **Online Migration**: Group migration while the application is running is not available between the classic and new streams protocol. + +* **Custom Client Supplier**: Using a custom `KafkaClientSupplier` will only allow so provide restore/global consumer, producer, and admin client. It's not possible to provide the "main" consumer when "streams" groups are enabled. Review Comment: Typo/grammar: “will only allow so provide …” reads incorrectly and is likely missing “to”. Please correct the sentence for clarity (also consider tightening the rest of the sentence around what the custom supplier can/can’t provide). ```suggestion * **Custom Client Supplier**: Using a custom `KafkaClientSupplier` can only customize the restore/global consumers, producers, and admin client; it cannot provide the "main" consumer when "streams" groups are enabled. ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
