Github user ScrapCodes commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19096#discussion_r137778385

    --- Diff: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaWriteTask.scala ---
    @@ -43,8 +43,10 @@ private[kafka010] class KafkaWriteTask(
        * Writes key value data out to topics.
        */
       def execute(iterator: Iterator[InternalRow]): Unit = {
    -    producer = CachedKafkaProducer.getOrCreate(producerConfiguration)
    +    val paramsSeq = CachedKafkaProducer.paramsToSeq(producerConfiguration)
         while (iterator.hasNext && failedWrite == null) {
    +      // Prevent producer to get expired/evicted from guava cache.(SPARK-21869)
    +      producer = CachedKafkaProducer.getOrCreate(paramsSeq)

    --- End diff --

Hi @zsxwing, thanks for looking. I feel the same way; it just seemed to be the easiest solution. Anyway, in the new approach I am tracking how many threads are currently using the producer. Since the Guava cache does not provide an API to prevent an item from being removed, we insert an in-use producer back into the cache instead of closing/cleaning it up.
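To make the idea concrete, here is a minimal, hypothetical sketch of the reference-counting approach described above. The names (`RefCountedCache`, `acquire`, `release`, `evict`) are illustrative, not Spark's actual `CachedKafkaProducer` API, and a plain `ConcurrentHashMap` stands in for the Guava cache: a producer's in-use count is bumped on acquire, and the eviction hook refuses to close a producer whose count is nonzero, keeping it cached instead.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: reference-counted cache entries. An entry is only
// removed/closed when no thread is using it; an in-use entry survives
// eviction attempts. Not the actual Spark implementation.
class RefCountedCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final AtomicInteger inUse = new AtomicInteger(0);
        Entry(V value) { this.value = value; }
    }

    private final ConcurrentHashMap<K, Entry<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> factory;

    RefCountedCache(Function<K, V> factory) { this.factory = factory; }

    /** Get the cached value, creating it if absent, and bump its in-use count. */
    V acquire(K key) {
        Entry<V> e = cache.computeIfAbsent(key, k -> new Entry<>(factory.apply(k)));
        e.inUse.incrementAndGet();
        return e.value;
    }

    /** Signal that one user of the value is done with it. */
    void release(K key) {
        Entry<V> e = cache.get(key);
        if (e != null) e.inUse.decrementAndGet();
    }

    /** Eviction hook: remove only when nobody is using the value. */
    boolean evict(K key) {
        Entry<V> e = cache.get(key);
        if (e == null) return false;
        if (e.inUse.get() == 0) {   // safe to remove and close: no active writers
            cache.remove(key);
            return true;
        }
        return false;               // still in use: keep it cached instead of closing
    }

    boolean contains(K key) { return cache.containsKey(key); }
}
```

Usage would look like: a write task calls `acquire` before writing and `release` in a finally block; the cache's removal listener calls `evict`, which leaves the producer alone while any task still holds it.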