HeartSaVioR commented on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking. URL: https://github.com/apache/spark/pull/19096#issuecomment-478187752 I had some kind of wondering that how much beneficial the complexity we are adding is, because the current code is a kind of no-brainer so it has benefits with no specific concern, but now it concerns about concurrency - actually not feeling much from the code diff (maybe because they were addressed), but from review comments concerning scope of synchronization, race condition, deadlock, etc. While I mentioned general connection pool, I basically meant thread-isolated usage. I wouldn't concern producer takes same approach as consumer with current custom cache code. (Though I think getting rid of custom cache code would be ideal.) But you're right I agree someone can manage the complexity (given there're reviewers and at least one committers will review and merge) when this patch is merged, then my wondering is not a big deal. And while we don't know clearly about benefits on current approach against thread-isolated usage (like perf. benefit via co-use of producer), we also don't know clearly about opposite way. So let's leave it as a second-thought.
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
