[
https://issues.apache.org/jira/browse/SPARK-25721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhaobo Yu updated SPARK-25721:
------------------------------
Attachment: (was: Screenshot from 2018-10-12 14-51-29.png)
> maxRate configuration not being used in Kinesis receiver
> --------------------------------------------------------
>
> Key: SPARK-25721
> URL: https://issues.apache.org/jira/browse/SPARK-25721
> Project: Spark
> Issue Type: Bug
> Components: DStreams
> Affects Versions: 2.2.0
> Reporter: Zhaobo Yu
> Priority: Major
>
> In the onStart() method of the KinesisReceiver class, the
> KinesisClientLibConfiguration object is initialized as follows:
> val kinesisClientLibConfiguration = new KinesisClientLibConfiguration(
>     checkpointAppName,
>     streamName,
>     kinesisProvider,
>     dynamoDBCreds.map(_.provider).getOrElse(kinesisProvider),
>     cloudWatchCreds.map(_.provider).getOrElse(kinesisProvider),
>     workerId)
>   .withKinesisEndpoint(endpointUrl)
>   .withInitialPositionInStream(initialPositionInStream)
>   .withTaskBackoffTimeMillis(500)
>   .withRegionName(regionName)
>
> Because the initialization never calls withMaxRecords(),
> KinesisClientLibConfiguration falls back to its hard-coded default of 10000:
> public static final int DEFAULT_MAX_RECORDS = 10000;
> As a result, the receiver does not honor any maxRate setting we configure.
> Worse still, it can cause ProvisionedThroughputExceededException from
> Kinesis, especially when we restart the streaming application.
>
> Attached
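
One possible direction for the problem described above is to derive the per-fetch record cap from the configured receiver rate and pass it to withMaxRecords(). The sketch below is only an illustration under stated assumptions, not the actual Spark fix: KinesisRateSketch and maxRecordsFor are hypothetical names, a non-positive maxRate is assumed to mean "no limit configured", and mapping a per-second receiver rate directly onto a per-fetch cap is itself a simplification.

```scala
// Hypothetical helper for illustration; not part of Spark or the KCL.
object KinesisRateSketch {
  // The KCL default quoted in the issue (DEFAULT_MAX_RECORDS).
  val DefaultMaxRecords: Int = 10000

  /** Derive a per-fetch record cap from a receiver rate limit.
    *
    * Assumption: maxRate <= 0 means no rate limit was configured,
    * in which case the KCL default is kept.
    */
  def maxRecordsFor(maxRate: Long): Int =
    if (maxRate <= 0) DefaultMaxRecords
    else math.min(maxRate, DefaultMaxRecords.toLong).toInt

  def main(args: Array[String]): Unit = {
    // In KinesisReceiver.onStart() the result would be chained onto the
    // configuration shown in the issue, for example:
    //   .withRegionName(regionName)
    //   .withMaxRecords(KinesisRateSketch.maxRecordsFor(maxRate))
    println(maxRecordsFor(500))    // rate below the default: used as the cap
    println(maxRecordsFor(0))      // no limit configured: default kept
    println(maxRecordsFor(50000))  // rate above the default: clamped
  }
}
```

Clamping to the default keeps the value within what the KCL already uses today, so a very large or unset maxRate would behave exactly as before.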
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)