gaborgsomogyi commented on issue #25853: [SPARK-21869][SS] Apply Apache Commons Pool to Kafka producer
URL: https://github.com/apache/spark/pull/25853#issuecomment-534519947
 
 
   > What happens to the old instance?
   
   If no task is using the old instance, then once the eviction time has elapsed it will be closed and removed from the cache.
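   To make the eviction part concrete, here is a minimal sketch of how an Apache Commons Pool 2 keyed pool can be configured so that idle producers get closed by the evictor. This is illustrative only, not the PR's actual code; the object names, key choice and timeout values are made up:

   ```scala
   import java.{util => ju}

   import org.apache.commons.pool2.{BaseKeyedPooledObjectFactory, PooledObject}
   import org.apache.commons.pool2.impl.{DefaultPooledObject, GenericKeyedObjectPool, GenericKeyedObjectPoolConfig}
   import org.apache.kafka.clients.producer.KafkaProducer

   object ProducerPoolSketch {
     type Producer = KafkaProducer[Array[Byte], Array[Byte]]

     // Pool keyed by the Kafka params used to build the producer (hypothetical key choice).
     class ProducerFactory extends BaseKeyedPooledObjectFactory[ju.Map[String, Object], Producer] {
       override def create(key: ju.Map[String, Object]): Producer =
         new KafkaProducer[Array[Byte], Array[Byte]](key)
       override def wrap(p: Producer): PooledObject[Producer] = new DefaultPooledObject(p)
       // Called when the evictor drops an idle instance: this is where the old producer is closed.
       override def destroyObject(key: ju.Map[String, Object], p: PooledObject[Producer]): Unit =
         p.getObject.close()
     }

     val conf = new GenericKeyedObjectPoolConfig[Producer]()
     conf.setMinEvictableIdleTimeMillis(5 * 60 * 1000L) // idle time before an instance is evictable (illustrative)
     conf.setTimeBetweenEvictionRunsMillis(60 * 1000L)  // how often the evictor thread runs (illustrative)

     val pool = new GenericKeyedObjectPool(new ProducerFactory, conf)
   }
   ```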
   
   > Is there a case, where we would use more than one kafka producer at a time 
?
   
   Not sure I understand your question exactly, so let me try to answer my interpretation of it. Kafka connection pooling mainly solves one problem: it spares the construction time of consumer/producer instances. This creation time can be significant when Kerberos and SSL encryption are enabled, and without pooling it would be paid in every micro-batch.
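   For illustration, a write path on top of such a pool could look roughly like the following, where the producer is borrowed for the duration of the task and returned afterwards, instead of being recreated (with its security handshake) in every micro-batch. `ProducerPoolSketch`, the topic name and the record types are assumptions carried over from the sketch above, not the PR's API:

   ```scala
   import org.apache.kafka.clients.producer.ProducerRecord

   def writeBatch(
       records: Iterator[(Array[Byte], Array[Byte])],
       kafkaParams: java.util.Map[String, Object]): Unit = {
     // Reuses an idle producer for these params if one exists; otherwise creates it once.
     val producer = ProducerPoolSketch.pool.borrowObject(kafkaParams)
     try {
       records.foreach { case (k, v) =>
         producer.send(new ProducerRecord("some-topic", k, v))
       }
       producer.flush()
     } finally {
       // Returned to the pool, not closed, so the next batch can pick it up again.
       ProducerPoolSketch.pool.returnObject(kafkaParams, producer)
     }
   }
   ```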
   
   As you've noted, producers are thread safe, so as a further optimization instances could be shared between threads without any harm. Since Apache Commons Pool doesn't support sharing a borrowed object, that can't be done here. If multiple threads use a producer with the same Kafka params, then multiple instances will be created (the same happens on the consumer side). This is a trade-off which I think makes sense when compared with the main advantages listed in the PR description.
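   The trade-off in code terms, roughly: two tasks that borrow with the same Kafka params at the same time each get their own instance, because a borrowed object isn't handed out again until it is returned. Again just an illustration on top of the sketch above, with made-up config values:

   ```scala
   val params = new java.util.HashMap[String, Object]()
   params.put("bootstrap.servers", "broker:9092") // hypothetical config

   // Both borrows happen before either producer is returned, so with the default pool
   // settings the second call creates a second producer for the same key.
   val p1 = ProducerPoolSketch.pool.borrowObject(params)
   val p2 = ProducerPoolSketch.pool.borrowObject(params)
   assert(!(p1 eq p2))

   ProducerPoolSketch.pool.returnObject(params, p1)
   ProducerPoolSketch.pool.returnObject(params, p2)
   ```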
   
   If you mean something else, please clarify and we can discuss it...
   
