hardikbajaj commented on issue #16727: URL: https://github.com/apache/druid/issues/16727#issuecomment-2258230638
Hey @asdf2014, I am currently working on this. I think [addDiscoveredTaskToPendingCompletionTaskGroups](https://github.com/apache/druid/blob/954aaafe0c85c1f4967bdb5798c17d4dc813ddd4/indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/supervisor/SeekableStreamSupervisor.java#L2485) should be synchronized with locks during reads too, not only during writes. A lock held while updating `pendingCompletionTaskGroups[group_id]` should not block reads or writes for other group ids. This way we can make sure that all tasks consuming from the same partitions always end up in a single TaskGroup. In the current state, even with a task replication of 3 or 4, there's a chance of ending up with a TaskGroup that contains a single task, and if that task fails for any reason, it is treated as if all the publishing tasks have failed. Please let me know if this solution looks good to the PMC team, and I'll start implementing it.
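To make the proposal concrete, here is a minimal sketch of per-group-id locking. It is not the actual Druid code: `PendingCompletionGroups`, `Group`, and `addDiscoveredTask` are hypothetical stand-ins, and using `ConcurrentHashMap.compute` is just one possible way to get the behavior, assumed here for illustration. The point is that the "look for a matching group, else create one" step runs atomically for a given group id, so two replicas discovered concurrently cannot each create their own single-task group.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical, simplified model of pendingCompletionTaskGroups (not Druid's real classes).
class PendingCompletionGroups {

    // Hypothetical stand-in for a TaskGroup: the partitions it covers and its replica task ids.
    static final class Group {
        final Set<Integer> partitions;
        final Set<String> taskIds = ConcurrentHashMap.newKeySet();

        Group(Set<Integer> partitions) {
            this.partitions = partitions;
        }
    }

    private final ConcurrentHashMap<Integer, CopyOnWriteArrayList<Group>> pending =
        new ConcurrentHashMap<>();

    // Adds a discovered task to the group that covers the same partitions,
    // creating that group only if none exists yet for this group id.
    Group addDiscoveredTask(int groupId, String taskId, Set<Integer> partitions) {
        Group[] result = new Group[1];
        // compute() runs the lookup and the insert atomically for this key,
        // so the check-then-add sequence cannot interleave with another
        // thread doing the same for the same group id.
        pending.compute(groupId, (id, existing) -> {
            CopyOnWriteArrayList<Group> groups =
                existing == null ? new CopyOnWriteArrayList<>() : existing;
            for (Group g : groups) {
                if (g.partitions.equals(partitions)) {
                    g.taskIds.add(taskId);
                    result[0] = g;
                    return groups;
                }
            }
            Group created = new Group(partitions);
            created.taskIds.add(taskId);
            groups.add(created);
            result[0] = created;
            return groups;
        });
        return result[0];
    }
}
```

With this shape, plain `get` calls on the map do not block, and updates for other group ids generally proceed in parallel (they can briefly contend only if they hash to the same bin). An explicit lock object per group id would be an alternative if the real method needs to do more work than fits comfortably inside a `compute` callback.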
