gaborgsomogyi commented on a change in pull request #24270:
[SPARK-27343][KAFKA][SS] Avoid hardcoded for spark-sql-kafka-0-10
URL: https://github.com/apache/spark/pull/24270#discussion_r274827760
##########
File path: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/package.scala
##########
@@ -16,9 +16,33 @@
*/
package org.apache.spark.sql
+import java.util.concurrent.TimeUnit
+
import org.apache.kafka.common.TopicPartition
+import org.apache.spark.internal.config.ConfigBuilder
+
package object kafka010 { // scalastyle:ignore
// ^^ scalastyle:ignore is for ignoring warnings about digits in package name
type PartitionOffsetMap = Map[TopicPartition, Long]
-}
+
+ private[spark] val PRODUCER_CACHE_TIMEOUT =
+ ConfigBuilder("spark.kafka.producer.cache.timeout")
+ .doc("How long an unused producer is kept in the cache before it is removed.")
+ .timeConf(TimeUnit.MILLISECONDS)
+ .createWithDefault(TimeUnit.MINUTES.toMillis(10))
+
+ private[spark] val CONSUMER_CACHE_CAPACITY =
+ ConfigBuilder("spark.sql.kafkaConsumerCache.capacity")
+ .doc("The maximum number of cached Kafka consumers (the capacity of the underlying LinkedHashMap).")
+ .intConf
+ .createWithDefault(64)
+
+ val MAX_OFFSET_PER_TRIGGER = "maxOffsetsPerTrigger"
Review comment:
I still don't see my second bullet point addressed in the latest change:
> I don't think these parameters should be placed here. Please see the
> `KafkaSourceProvider` object.
To be concrete: everything here from `MIN_PARTITIONS` through `SUBSCRIBE`
should move into the `KafkaSourceProvider` object.
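A minimal sketch of what that would look like. The distinction the review draws is between cluster-wide `ConfigEntry` definitions (which stay in the package object) and per-query DataSource option keys (which belong next to the code that parses them). Only `MAX_OFFSET_PER_TRIGGER` appears in the diff above; `MIN_PARTITIONS` and `SUBSCRIBE` are named in the comment, and the exact key names and visibility modifiers below are assumptions, not the PR's actual code:

```scala
package org.apache.spark.sql.kafka010

// Sketch only: per-query option keys moved into the KafkaSourceProvider
// companion object, as the review suggests. Names are illustrative.
private[kafka010] object KafkaSourceProvider {
  // DataSource option keys, kept together with the provider that reads them
  // via DataSourceOptions / the options map.
  private[kafka010] val MIN_PARTITIONS = "minPartitions"
  private[kafka010] val MAX_OFFSET_PER_TRIGGER = "maxOffsetsPerTrigger"
  private[kafka010] val SUBSCRIBE = "subscribe"
}
```

The cluster-level entries built with `ConfigBuilder` (`PRODUCER_CACHE_TIMEOUT`, `CONSUMER_CACHE_CAPACITY`) are a different kind of setting, read from `SparkConf` rather than per-query options, which is why the review asks for the split.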
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services