Kaspar Tint created SPARK-26396:
-----------------------------------

             Summary: Kafka consumer cache overflow since 2.4.x
                 Key: SPARK-26396
                 URL: https://issues.apache.org/jira/browse/SPARK-26396
             Project: Spark
          Issue Type: Bug
          Components: Structured Streaming
    Affects Versions: 2.4.0
         Environment: Spark 2.4 standalone client mode
            Reporter: Kaspar Tint


We are experiencing an issue where the Kafka consumer cache seems to overflow 
constantly upon starting the application. This issue appeared after upgrading 
to Spark 2.4.

We would get constant warnings like this:
{code}
18/12/18 07:03:29 WARN KafkaDataConsumer: KafkaConsumer cache hitting max capacity of 180, removing consumer for CacheKey(spark-kafka-source-6f66e0d2-beaf-4ff2-ade8-8996611de6ae--1081651087-executor,kafka-topic-76)
18/12/18 07:03:32 WARN KafkaDataConsumer: KafkaConsumer cache hitting max capacity of 180, removing consumer for CacheKey(spark-kafka-source-6f66e0d2-beaf-4ff2-ade8-8996611de6ae--1081651087-executor,kafka-topic-30)
18/12/18 07:03:32 WARN KafkaDataConsumer: KafkaConsumer cache hitting max capacity of 180, removing consumer for CacheKey(spark-kafka-source-f41d1f9e-1700-4994-9d26-2b9c0ee57881--215746753-executor,kafka-topic-57)
18/12/18 07:03:32 WARN KafkaDataConsumer: KafkaConsumer cache hitting max capacity of 180, removing consumer for CacheKey(spark-kafka-source-f41d1f9e-1700-4994-9d26-2b9c0ee57881--215746753-executor,kafka-topic-43)
{code}

This application runs 4 different Spark Structured Streaming queries against the same Kafka topic, which has 90 partitions. On Spark 2.3 we ran with the default settings, so the cache capacity defaulted to 64. After upgrading we raised {{spark.sql.kafkaConsumerCache.capacity}} to 180 and then to 360. With 360 there is a lot less noise about the overflow, but resource consumption increases substantially.
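Back-of-the-envelope: 4 queries, each with its own consumer group id, reading 90 partitions gives 4 × 90 = 360 distinct (group, partition) cache keys per executor, double the capacity of 180, so an LRU cache evicts on essentially every access once warm. A minimal Python sketch of that behaviour (the {{LruCache}} class and the group/topic names are illustrative assumptions, not Spark's actual implementation):

```python
from collections import OrderedDict

# Toy LRU cache standing in for the per-executor KafkaConsumer cache.
class LruCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.evictions = 0

    def access(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)   # cache hit: refresh recency
            return
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
            self.evictions += 1
        self.entries[key] = object()          # cache miss: insert consumer

# 4 queries x 90 partitions: every (group, partition) pair is a distinct key.
keys = [(f"group-{q}", f"kafka-topic-{p}") for q in range(4) for p in range(90)]

cache = LruCache(capacity=180)
for _ in range(10):       # 10 micro-batches
    for k in keys:        # each batch touches all 360 keys in order
        cache.access(k)

# Cyclic access over 360 keys with capacity 180 misses on every
# steady-state access, matching the constant WARN lines above.
print(cache.evictions)
```

With the capacity at 360 or above, every key fits and the eviction count drops to zero, which matches the quieter logs we see at 360.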



