gaborgsomogyi commented on issue #19096: [SPARK-21869][SS] A cached Kafka producer should not be closed if any task is using it - adds inuse tracking.
URL: https://github.com/apache/spark/pull/19096#issuecomment-528431633

https://github.com/apache/spark/pull/22138 has been merged, which changed my view on how to solve this issue (until now I was not sure committers had enough confidence to merge that new technology).

Proposal: use Apache Commons Pool

My considerations:
* This way we don't have to do reference counting manually (addressing @jose-torres's concerns)
* Monitoring the Kafka consumer/producer cache has been on my list for a long time. Apache Commons Pool provides metrics by default.
* TD suggested this when the consumer-side PR was filed/merged. Of course he made his suggestion for the consumer side, but his reasoning still applies here (no manual reference counting).

If you agree, I'm happy to lend a hand during review.
@ScrapCodes if you don't have time to invest, then I'm happy to do the coding part.

My PR https://github.com/apache/spark/pull/23956 has depended on this for a long time and I would like to push it forward (delegation tokens will not work in all cases).

Guys, please share your thoughts.
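
To make the "no manual reference counting" point concrete, here is a minimal JDK-only sketch of the borrow/return contract that a keyed pool offers. It is an assumption-laden illustration, not the Apache Commons Pool API and not Spark code: `KeyedPoolSketch`, `FakeProducer`, `giveBack` and `evictIdle` are hypothetical names invented for this example. The idea it shows is that a borrowed object is simply absent from the idle set, so eviction can never close a producer that a task is still using.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical stand-in for a Kafka producer; only tracks whether it was closed. */
final class FakeProducer {
    final String bootstrapServers;
    boolean closed = false;

    FakeProducer(String bootstrapServers) {
        this.bootstrapServers = bootstrapServers;
    }

    void close() {
        closed = true;
    }
}

/** Hypothetical keyed pool: objects are either borrowed (active) or idle, never both. */
final class KeyedPoolSketch<K, V> {
    private final Function<K, V> factory;    // creates a new object for a key
    private final Consumer<V> destroyer;     // releases an object on eviction
    private final Map<K, Deque<V>> idle = new HashMap<>();
    private int numActive = 0;

    KeyedPoolSketch(Function<K, V> factory, Consumer<V> destroyer) {
        this.factory = factory;
        this.destroyer = destroyer;
    }

    /** Hands out an idle object for the key, or creates one if none is idle. */
    synchronized V borrow(K key) {
        Deque<V> queue = idle.get(key);
        V obj = (queue == null || queue.isEmpty()) ? factory.apply(key) : queue.pop();
        numActive++;
        return obj;
    }

    /** Returns a borrowed object to the idle set so it can be reused or evicted. */
    synchronized void giveBack(K key, V obj) {
        idle.computeIfAbsent(key, k -> new ArrayDeque<>()).push(obj);
        numActive--;
    }

    /** Destroys idle objects only; borrowed objects are out of reach by construction. */
    synchronized int evictIdle() {
        int destroyed = 0;
        for (Deque<V> queue : idle.values()) {
            while (!queue.isEmpty()) {
                destroyer.accept(queue.pop());
                destroyed++;
            }
        }
        idle.clear();
        return destroyed;
    }

    /** Simple counters, in the spirit of the metrics a pool library can expose. */
    synchronized int getNumActive() {
        return numActive;
    }

    synchronized int getNumIdle() {
        int total = 0;
        for (Deque<V> queue : idle.values()) {
            total += queue.size();
        }
        return total;
    }
}
```

A task would call `borrow(key)` before sending and `giveBack(key, producer)` in a `finally` block. A real pool library would additionally handle idle timeouts, capacity limits and validation, which this sketch leaves out.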
