ScrapCodes commented on issue #25853: [SPARK-21869][SS] Apply Apache Commons Pool to Kafka producer URL: https://github.com/apache/spark/pull/25853#issuecomment-534488064

Thanks for your interest in redoing the entire patch from scratch; you had to rewrite the test suites as well to fit the new pool API. Commons Pool wraps each pooled object in a PooledObject wrapper and keeps all the in-use tracking information inside that per-object wrapper. Since the Guava cache does not do this, we had to add that tracking ourselves earlier (in my previous PR). So this is definitely better than using Guava for the tracking.

Producers are cached by the Kafka parameters they are created with; in other words, if the Kafka params change, we get a fresh instance of KafkaProducer from the cache. What happens to the old instance? It sits in the cache until it expires and gets evicted. Since KafkaProducer is thread safe, it is shared across all the threads on the executor.

Q. Is there a case where we would use more than one Kafka producer at a time? If no, then why do we need object pooling? If yes, when would that happen?
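To make the caching behavior described above concrete, here is a minimal, self-contained sketch of a keyed producer cache where each entry carries its own in-use state, mimicking what Commons Pool's PooledObject wrapper tracks automatically (and what the earlier Guava-based cache had to track by hand). `KeyedProducerCache`, `FakeProducer`, and `Pooled` are hypothetical names for illustration only; `FakeProducer` stands in for KafkaProducer, and the parameter map stands in for the Kafka params the producer was created with. This is not the actual Commons Pool API or the PR's implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class KeyedProducerCache {
    // Stand-in for KafkaProducer, keyed by the params it was created with.
    static final class FakeProducer {
        final Map<String, String> params;
        FakeProducer(Map<String, String> params) { this.params = params; }
    }

    // Per-object wrapper holding in-use state, analogous in spirit to
    // Commons Pool's PooledObject (which tracks this for us).
    static final class Pooled {
        final FakeProducer producer;
        final AtomicInteger inUse = new AtomicInteger(0);
        Pooled(FakeProducer p) { this.producer = p; }
    }

    private final ConcurrentHashMap<Map<String, String>, Pooled> cache =
        new ConcurrentHashMap<>();

    // A changed parameter map is a different key, so it yields a fresh
    // producer; the old entry lingers until some eviction policy
    // (not shown here) expires and removes it.
    FakeProducer acquire(Map<String, String> kafkaParams) {
        Pooled p = cache.computeIfAbsent(
            kafkaParams, k -> new Pooled(new FakeProducer(k)));
        p.inUse.incrementAndGet();
        return p.producer;
    }

    void release(Map<String, String> kafkaParams) {
        Pooled p = cache.get(kafkaParams);
        if (p != null) p.inUse.decrementAndGet();
    }

    int size() { return cache.size(); }

    public static void main(String[] args) {
        KeyedProducerCache cache = new KeyedProducerCache();
        Map<String, String> a = Map.of("bootstrap.servers", "broker1:9092");
        Map<String, String> b = Map.of("bootstrap.servers", "broker2:9092");

        FakeProducer p1 = cache.acquire(a);
        FakeProducer p2 = cache.acquire(a); // same params -> same shared instance
        FakeProducer p3 = cache.acquire(b); // changed params -> fresh instance

        System.out.println(p1 == p2);     // true
        System.out.println(p1 == p3);     // false
        System.out.println(cache.size()); // 2
    }
}
```

The key point the sketch illustrates: because the producer is thread safe and shared, only the keyed lookup and the per-entry in-use count matter, and Commons Pool supplies both out of the box.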
