linehrr commented on a change in pull request #23666: [SPARK-26718][SS] Fixed
integer overflow in SS kafka rateLimit calculation
URL: https://github.com/apache/spark/pull/23666#discussion_r251589732
##########
File path:
external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchSourceSuite.scala
##########
@@ -204,6 +204,43 @@ abstract class KafkaMicroBatchSourceSuiteBase extends KafkaSourceSuiteBase {
StopStream)
}
+ test("Rate limit set to Long.Max should not overflow integer " +
+ "during end offset calculation[SPARK-26718]") {
+ val topic = newTopic()
+ testUtils.createTopic(topic, partitions = 1)
+ // fill in 5000 messages to trigger potential integer overflow
+ testUtils.sendMessages(topic, (0 until 5000).map(_.toString).toArray, Some(0))
+
+ val partitionOffsets = Map(
+ new TopicPartition(topic, 0) -> 5000L
+ )
+ val startingOffsets = JsonUtils.partitionOffsets(partitionOffsets)
+
+ val sparkSession = spark
+ import sparkSession.implicits._
+ val kafka = spark
+ .readStream
+ .format("kafka")
+ .option("kafka.bootstrap.servers", testUtils.brokerAddress)
+ // use latest to force begin to be 5000
+ .option("startingOffsets", startingOffsets)
+ // use Long.Max to try to trigger overflow
+ .option("maxOffsetsPerTrigger", Long.MaxValue.toString)
Review comment:
Oh, good point. I actually wish SparkConf.set() had such overloads as well,
LOL.
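
For readers following the test quoted above, here is a minimal sketch of the kind of overflow it guards against. It assumes the end offset is computed by adding the rate limit to the beginning offset; that is an assumption, not the actual Spark code. The names `naiveEnd` and `saturatingEnd` are hypothetical and exist only for illustration.

```scala
object RateLimitOverflowSketch {
  // Hypothetical naive calculation: wraps around to a negative value
  // when begin + limit exceeds Long.MaxValue.
  def naiveEnd(begin: Long, limit: Long): Long = begin + limit

  // Hypothetical saturating variant: clamps to Long.MaxValue instead of
  // wrapping. Assumes begin >= 0, as Kafka offsets are non-negative.
  def saturatingEnd(begin: Long, limit: Long): Long =
    if (limit > Long.MaxValue - begin) Long.MaxValue else begin + limit

  def main(args: Array[String]): Unit = {
    val begin = 5000L          // same starting offset as in the test
    val limit = Long.MaxValue  // same value as maxOffsetsPerTrigger in the test
    println(naiveEnd(begin, limit))      // a negative number: the sum wrapped
    println(saturatingEnd(begin, limit)) // Long.MaxValue
  }
}
```

With a starting offset of 5000 and a limit of Long.MaxValue, the naive sum wraps to a negative offset, while the clamped version stays at Long.MaxValue.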