[ 
https://issues.apache.org/jira/browse/SPARK-19968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prashant Sharma updated SPARK-19968:
------------------------------------
    Description: 
KafkaProducer is thread-safe, so a single instance can be reused to write out 
every batch. The Kafka docs encourage this usage, and it also improves 
performance.

On average, an addBatch operation takes 25 ms with this patch and 250+ ms 
without it.

Benchmark results are posted on the GitHub PR.


  was:
KafkaProducer is thread-safe, so a single instance can be reused to write out 
every batch. The Kafka docs encourage this usage.

On average, an addBatch operation takes 25 ms with this patch and 250+ ms 
without it.

Benchmark results are posted on the GitHub PR.



> Use a cached instance of KafkaProducer for writing to kafka via KafkaSink.
> --------------------------------------------------------------------------
>
>                 Key: SPARK-19968
>                 URL: https://issues.apache.org/jira/browse/SPARK-19968
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.2.0
>            Reporter: Prashant Sharma
>            Assignee: Prashant Sharma
>              Labels: kafka
>
> KafkaProducer is thread-safe, so a single instance can be reused to write out 
> every batch. The Kafka docs encourage this usage, and it also improves 
> performance.
> On average, an addBatch operation takes 25 ms with this patch and 250+ ms 
> without it.
> Benchmark results are posted on the GitHub PR.
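
For illustration only, the caching idea described above can be sketched as 
follows. This is not the code from the patch: ProducerCache, getOrCreate and 
the factory argument are hypothetical names, and the producer type is left 
generic so the sketch runs without a Kafka dependency. The assumption is that 
one producer is shared per distinct set of producer parameters.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical sketch: keeps one shared producer per distinct configuration,
 * so every batch reuses an existing instance instead of building a new one.
 * P stands in for a thread-safe producer type such as KafkaProducer.
 */
final class ProducerCache<P> {
    private final ConcurrentHashMap<Map<String, Object>, P> cache =
        new ConcurrentHashMap<>();
    private final Function<Map<String, Object>, P> factory;

    ProducerCache(Function<Map<String, Object>, P> factory) {
        this.factory = factory;
    }

    /** Returns the cached producer for these params, creating it on first use. */
    P getOrCreate(Map<String, Object> params) {
        // Copy the params so later changes by the caller cannot alter the key.
        Map<String, Object> key =
            Collections.unmodifiableMap(new HashMap<>(params));
        // computeIfAbsent runs the factory at most once per key, even when
        // several threads ask for the same configuration concurrently.
        return cache.computeIfAbsent(key, factory);
    }

    /** Number of distinct configurations currently cached. */
    int size() {
        return cache.size();
    }
}
```

In a real sink the factory would presumably build a KafkaProducer from the 
params, and the cache would also need a way to close producers that are no 
longer in use; both are left out of this sketch.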


