kumarpritam863 opened a new pull request, #15234:
URL: https://github.com/apache/iceberg/pull/15234

   ### Problem
   
   The current implementation requires users to set a separate 
`iceberg.connect.group.id` configuration that **must match** the source 
consumer group ID (`consumer.override.group.id` / `group.id`) for the Kafka 
Connect coordinator to be elected correctly.
   
   This requirement has caused significant confusion and bugs for developers 
and users since the beginning, including:
   
   - Developers forgetting to set `connect.group.id` or setting it incorrectly
   - Not realizing that the two values must be identical
   - Subtle and hard-to-debug coordination issues
   
   **Problematic scenario example:**
   
   1. Job A is running with `consumer.override.group.id = "x-1"`
   2. Job B is submitted with:
      - `consumer.override.group.id = "cg"`
      - `connect.group.id = "x-1"`
   3. Both jobs consume from the **same topic**
   
   **Result:**
   Job B's actual consumer group ID (`cg`) differs from its `connect.group.id` 
(`x-1`), yet coordinator election for job B is still based on 
`connect.group.id = "x-1"`, which is job A's consumer group. This leads to:
   - Wrong group being used for coordination
   - Incorrect offset commits
   - Potential data loss/duplication or reprocessing
   - Very confusing behavior that violates the principle of least surprise
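
The mismatch in this scenario can be modeled with a small sketch. It is a simplified illustration rather than the connector's actual code: the class `LegacyGroupIds` and its methods are hypothetical, the property keys are spelled as they appear in this description, and it assumes the pre-PR behavior takes the coordination group directly from `iceberg.connect.group.id`.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical model of the pre-PR behavior; not the connector's real code.
final class LegacyGroupIds {
  private LegacyGroupIds() {}

  // Group the sink's consumer actually joins.
  static String consumerGroup(Map<String, String> props) {
    return props.get("consumer.override.group.id");
  }

  // Group used for coordinator election before this PR
  // (assumption: read as-is from the separate setting).
  static String coordinationGroup(Map<String, String> props) {
    return props.get("iceberg.connect.group.id");
  }

  // True when coordination targets a group the job does not consume with.
  static boolean isMismatched(Map<String, String> props) {
    return !Objects.equals(consumerGroup(props), coordinationGroup(props));
  }
}
```

With job B's settings (`cg` for the consumer, `x-1` for coordination), `isMismatched` returns `true`: coordination would run against job A's group.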
   
   ### Solution
   
   This PR:
   
   - **Removes** the `connect.group.id` configuration completely
   - Always derives the Connect coordinator group ID from the **actual source 
consumer group ID** (`group.id` / `consumer.override.group.id`)
   - Renames the internal concept/reference from `connectGroupId` → 
`sourceConsumerGroupId` for clarity (in code/comments where applicable)
   - Updates documentation and configuration validation accordingly
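
A minimal sketch of what deriving the group ID from the source consumer group could look like. The class and method names are hypothetical and not taken from the PR. The fallback to `connect-<connector name>` is an assumption based on Kafka Connect's default naming for sink connector consumer groups.

```java
import java.util.Map;

// Hypothetical helper illustrating the derivation; names are not from the PR.
final class SourceConsumerGroupIds {
  private SourceConsumerGroupIds() {}

  static String resolve(String connectorName, Map<String, String> props) {
    // An explicit consumer override, when present, is the actual group ID.
    String override = props.get("consumer.override.group.id");
    if (override != null && !override.isEmpty()) {
      return override;
    }
    // Assumption: without an override, Kafka Connect names a sink
    // connector's consumer group "connect-<connector name>".
    return "connect-" + connectorName;
  }
}
```

The sketch never reads the removed setting, so a leftover `iceberg.connect.group.id` entry in the connector config has no effect on the result.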
   
   ### Benefits
   
   - Eliminates an entire class of misconfiguration bugs
   - Removes a redundant and confusing configuration option
   - Makes behavior more predictable and intuitive
   - Prevents the problematic scenario described above
   - Reduces cognitive load for users and maintainers
   
   ### Breaking Change
   
   **No** – Even if `iceberg.connect.group.id` is set, it will be ignored and 
the correct value will be derived from the consumer group ID.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

