kumarpritam863 opened a new pull request, #15234:
URL: https://github.com/apache/iceberg/pull/15234
### Problem
The current implementation requires users to set a separate
`iceberg.connect.group.id` configuration that **must match** the source
consumer group ID (`consumer.override.group.id` / `group.id`) for the Kafka
Connect coordinator to be elected correctly.
This has caused significant confusion and bugs among developers and users
since the beginning, including:
- Developers forgetting to set `connect.group.id` or setting it incorrectly
- Misunderstanding that the two values are actually required to be identical
- Subtle and hard-to-debug coordination issues
**Problematic scenario example:**
1. Job A is running with `consumer.override.group.id = "x-1"`
2. Job B is submitted with:
   - `consumer.override.group.id = "cg"`
   - `connect.group.id = "x-1"`
3. Both jobs consume from the **same topic**
**Result:**
Even though Job B's actual consumer group ID (`cg`) differs from its
`connect.group.id` (`x-1`), the coordinator election is still based on
`connect.group.id = "x-1"`, the group that Job A consumes with. This leads to:
- Wrong group being used for coordination
- Incorrect offset commits
- Potential data loss/duplication or reprocessing
- Very confusing behavior that violates the principle of least surprise
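
To make the scenario concrete, here is a minimal sketch that models the two jobs as plain maps. The class and helper names (`GroupIdMismatchSketch`, `coordinationGroup`, `consumerGroup`, `isMisconfigured`) are hypothetical and do not come from the Iceberg code base; the sketch only assumes, as described above, that the old behavior takes the coordination group from `connect.group.id`.

```java
import java.util.Map;

public class GroupIdMismatchSketch {

    // Assumed old behavior: coordination follows connect.group.id.
    static String coordinationGroup(Map<String, String> job) {
        return job.get("connect.group.id");
    }

    // The group the job actually consumes with.
    static String consumerGroup(Map<String, String> job) {
        return job.get("consumer.override.group.id");
    }

    // True when a job coordinates on a group it does not consume with.
    static boolean isMisconfigured(Map<String, String> job) {
        String coordination = coordinationGroup(job);
        return coordination != null && !coordination.equals(consumerGroup(job));
    }

    public static void main(String[] args) {
        Map<String, String> jobA = Map.of("consumer.override.group.id", "x-1");
        Map<String, String> jobB = Map.of(
            "consumer.override.group.id", "cg",
            "connect.group.id", "x-1");

        // Job B coordinates on "x-1" while consuming as "cg".
        System.out.println(isMisconfigured(jobB));
        // Job B's coordination group collides with Job A's consumer group.
        System.out.println(coordinationGroup(jobB).equals(consumerGroup(jobA)));
    }
}
```

Both checks print `true` for the values in the scenario: Job B is internally inconsistent, and its coordination group is the one Job A consumes with.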
### Solution
This PR:
- **Removes** the `connect.group.id` configuration completely
- Always derives the Connect coordinator group ID from the **actual source
consumer group ID** (`group.id` / `consumer.override.group.id`)
- Renames the internal concept/reference from `connectGroupId` →
`sourceConsumerGroupId` for clarity (in code/comments where applicable)
- Updates documentation and configuration validation accordingly
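
As a rough illustration of the derivation, and not the actual code in this PR, the lookup could look like the sketch below. The class and method names are hypothetical. The precedence order (`consumer.override.group.id` before `group.id`) and the exception when neither is set are assumptions made for this example.

```java
import java.util.Map;

public class SourceConsumerGroupIdSketch {

    // Derives the coordinator group ID from the source consumer group ID.
    // The legacy iceberg.connect.group.id key is deliberately never read.
    static String sourceConsumerGroupId(Map<String, String> props) {
        // Assumed precedence: the connector-level override wins.
        String override = props.get("consumer.override.group.id");
        if (override != null && !override.isEmpty()) {
            return override;
        }
        String groupId = props.get("group.id");
        if (groupId != null && !groupId.isEmpty()) {
            return groupId;
        }
        // Assumed failure mode for this sketch only.
        throw new IllegalArgumentException("No source consumer group ID configured");
    }

    public static void main(String[] args) {
        Map<String, String> jobB = Map.of(
            "consumer.override.group.id", "cg",
            "iceberg.connect.group.id", "x-1");
        // The stale legacy value "x-1" has no effect on the result.
        System.out.println(sourceConsumerGroupId(jobB));
    }
}
```

With Job B's settings from the scenario above, this prints `cg`, so coordination and consumption use the same group.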
### Benefits
- Eliminates an entire class of misconfiguration bugs
- Removes a redundant and confusing configuration option
- Makes behavior more predictable and intuitive
- Prevents the problematic scenario described above
- Reduces cognitive load for users and maintainers
### Breaking Change
**No.** Even if `iceberg.connect.group.id` is set, it is ignored and the
coordinator group ID is derived from the source consumer group ID.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]