Patrick McGloin created SPARK-25466:
---------------------------------------

             Summary: Documentation does not specify how to set Kafka consumer 
cache capacity for SS
                 Key: SPARK-25466
                 URL: https://issues.apache.org/jira/browse/SPARK-25466
             Project: Spark
          Issue Type: Bug
          Components: Structured Streaming
    Affects Versions: 2.3.0
            Reporter: Patrick McGloin


When hitting this warning with Structured Streaming (SS):

19-09-2018 12:05:27 WARN  CachedKafkaConsumer:66 - KafkaConsumer cache hitting 
max capacity of 64, removing consumer for 
CacheKey(spark-kafka-source-e06c9676-32c6-49c4-80a9-2d0ac4590609--694285871-executor,MyKafkaTopic-30)

If you Google the warning, you get to this page:

https://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html

That page is for Spark Streaming (DStreams) and says to use this config item to 
adjust the capacity: "spark.streaming.kafka.consumer.cache.maxCapacity".

This is a bit confusing, as SS uses a different config item: 
"spark.sql.kafkaConsumerCache.capacity".

Perhaps the SS Kafka documentation should mention the consumer cache 
capacity?  Perhaps here:

https://spark.apache.org/docs/2.2.0/structured-streaming-kafka-integration.html

Or perhaps the warning message should reference the config item, e.g.:

19-09-2018 12:05:27 WARN  CachedKafkaConsumer:66 - KafkaConsumer cache hitting 
max capacity of 64, removing consumer for 
CacheKey(spark-kafka-source-e06c9676-32c6-49c4-80a9-2d0ac4590609--694285871-executor,MyKafkaTopic-30).
  *The cache size can be adjusted with the setting 
"spark.sql.kafkaConsumerCache.capacity".*




