Patrick McGloin created SPARK-25466:
---------------------------------------
             Summary: Documentation does not specify how to set Kafka consumer cache capacity for SS
                 Key: SPARK-25466
                 URL: https://issues.apache.org/jira/browse/SPARK-25466
             Project: Spark
          Issue Type: Bug
          Components: Structured Streaming
    Affects Versions: 2.3.0
            Reporter: Patrick McGloin

When hitting this warning with SS:

19-09-2018 12:05:27 WARN CachedKafkaConsumer:66 - KafkaConsumer cache hitting max capacity of 64, removing consumer for CacheKey(spark-kafka-source-e06c9676-32c6-49c4-80a9-2d0ac4590609--694285871-executor,MyKafkaTopic-30)

and you Google the message, you land on this page:

https://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html

which is for Spark Streaming (DStreams) and says to adjust the capacity with the config item "spark.streaming.kafka.consumer.cache.maxCapacity". This is confusing, as Structured Streaming uses a different config item: "spark.sql.kafkaConsumerCache.capacity".

Perhaps the SS Kafka documentation should describe the consumer cache capacity? Perhaps here:

https://spark.apache.org/docs/2.2.0/structured-streaming-kafka-integration.html

Or perhaps the warning message should reference the config item, e.g.:

19-09-2018 12:05:27 WARN CachedKafkaConsumer:66 - KafkaConsumer cache hitting max capacity of 64, removing consumer for CacheKey(spark-kafka-source-e06c9676-32c6-49c4-80a9-2d0ac4590609--694285871-executor,MyKafkaTopic-30). *The cache size can be adjusted with the setting "spark.sql.kafkaConsumerCache.capacity".*
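For reference, a minimal sketch of raising the SS cache capacity; only the config key "spark.sql.kafkaConsumerCache.capacity" comes from this report, while the app name, bootstrap servers, and the value 128 are placeholders:

    import org.apache.spark.sql.SparkSession

    // Build a session with a larger executor-side Kafka consumer cache.
    // The key is the Structured Streaming config item discussed above;
    // 128 is an arbitrary example value (the default is 64).
    val spark = SparkSession.builder()
      .appName("ss-kafka-cache-example")  // placeholder name
      .config("spark.sql.kafkaConsumerCache.capacity", "128")
      .getOrCreate()

    // A Kafka source whose partitions populate the consumer cache.
    val df = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "host1:9092")  // placeholder
      .option("subscribe", "MyKafkaTopic")
      .load()

Since the cache lives on the executors, the setting presumably needs to go into the Spark conf (e.g. via spark-submit --conf) rather than as a per-query source option.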