vamossagar12 commented on PR #13376: URL: https://github.com/apache/kafka/pull/13376#issuecomment-1487963852
> @vamossagar12 I don't think this is the right approach.
>
> 1. In the original design, hanging transactions blocking the global offsets topic was considered in detail, and the solution accepted at the time was per-connector offsets topics. I think this was in part because the framework couldn't make the right tradeoff of when to abort transactions in a way that was good enough for any use-case.

I agree, but if somebody does still use the global offsets topic, this issue remains for them, so I thought it might still be useful to fix. For example, automatic topic creation is disabled in some environments/enterprises, so whenever a new connector is created, a request for a new offsets topic needs to be raised, which can be a hassle at times.

> 2. Aborting transactions proactively on losing the task assignment treats every rebalance as a failure scenario, and does not permit the task to perform a clean end-of-lifetime commit, which is explicitly handled via the (internal) `TransactionBoundaryManager::shouldCommitFinalTransaction`, and is used in all of the different transaction boundary modes.

Yeah, in that sense it's expensive, but I have tried to do it only when we detect that there are missing workers during a rebalance. So what you are saying definitely applies when workers are coming up and down very frequently.

> 3. Fencing zombie source tasks is a very expensive and blocking operation, and it may delay rebalances that otherwise complete nearly instantly. I'm not sure how the cluster would react to assignments taking a significant amount of time to compute, but I don't think it would be more available and better behaved than it is now.

I see. I am not fully aware of how costly it is, but I feel that's a price we would have to pay at some point when using the global offsets topic. Otherwise, there would always be zombies lingering around.

> 4. This compromises the interface of the `IncrementalCooperativeAssignor` significantly. Previously, it only had an effect via the return value of `performAssignment`, and now it would have a side-effect on the (mutable) passed-in `Coordinator`.

Agreed, please check my comment [here](https://github.com/apache/kafka/pull/13376#issuecomment-1487939356).

> 5. This puts a solution to a problem in one implementation of an interface (`IncrementalCooperativeAssignor`) and not another (`EagerAssignor`). Since this isn't an Incremental-specific problem, why is the solution only present in incremental mode?

Yeah, I had assumed that Incremental mode is more prevalent and hence thought I would try to fix it there first. We can always extend it to Eager.
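For context on the per-connector offsets topics mentioned in point 1: with exactly-once source support, a connector can be pointed at its own offsets topic via the `offsets.storage.topic` property in its configuration. A minimal sketch of a connector-creation request body, with illustrative connector class and topic names:

```json
{
  "name": "my-source-connector",
  "config": {
    "connector.class": "org.example.MySourceConnector",
    "exactly.once.support": "required",
    "offsets.storage.topic": "my-source-connector-offsets"
  }
}
```

If automatic topic creation is disabled on the Kafka cluster, the topic named in `offsets.storage.topic` would need to be created out of band before the connector starts, which is the operational hassle described above.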