[ https://issues.apache.org/jira/browse/SPARK-17938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15578133#comment-15578133 ]
Cody Koeninger commented on SPARK-17938:
----------------------------------------
There was a pretty extensive discussion of this on the list; it should be
linked or summarized here.
A couple of things here:
100 is the default minimum rate for PIDRateEstimator
(spark.streaming.backpressure.pid.minRate). If you're willing to write
code, put more logging in to determine why that rate isn't being configured, or
hardcode it to a different number. I have successfully adjusted that rate using
Spark configuration.
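To illustrate how the estimate can end up pinned at the minimum, here is a
minimal Python sketch modelled on a PID-style rate update. It is a
simplification, not Spark's code: the function name `next_rate`, its parameter
names, and the sample numbers (5 seconds of processing time, 10 seconds of
scheduling delay) are assumptions made for illustration. The gains 1.0, 0.2,
0.0 and the floor of 100 are intended to mirror Spark's defaults.

```python
def next_rate(latest_rate, latest_error, num_elements, processing_delay_ms,
              scheduling_delay_ms, delay_since_update_s,
              batch_interval_ms=1000, proportional=1.0, integral=0.2,
              derivative=0.0, min_rate=100.0):
    """Return (new_rate, error) for one PID-style update (illustrative only)."""
    # Records per second the job actually managed for the finished batch.
    processing_rate = num_elements / processing_delay_ms * 1000.0
    # How far the previous rate limit overshot what the job could process.
    error = latest_rate - processing_rate
    # Backlog term: records that piled up while batches waited to be scheduled.
    historical_error = scheduling_delay_ms * processing_rate / batch_interval_ms
    # Trend of the error since the last update.
    d_error = (error - latest_error) / delay_since_update_s
    candidate = (latest_rate
                 - proportional * error
                 - integral * historical_error
                 - derivative * d_error)
    # The estimate is never allowed to drop below the configured minimum.
    return max(candidate, min_rate), error


# Hypothetical slow batch: 100k records took 5 s and waited 10 s in the queue.
clamped, _ = next_rate(latest_rate=100000.0, latest_error=0.0,
                       num_elements=100000, processing_delay_ms=5000,
                       scheduling_delay_ms=10000, delay_since_update_s=1.0)

# Same batch with no scheduling delay: the estimate follows the processing rate.
unclamped, _ = next_rate(latest_rate=100000.0, latest_error=0.0,
                         num_elements=100000, processing_delay_ms=5000,
                         scheduling_delay_ms=0, delay_since_update_s=1.0)

print(clamped, unclamped)
```

In this sketch, once a large backlog has built up, the raw value goes negative
and the floor takes over, so the reported rate sits at the minimum until the
scheduling delay shrinks.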
The other thing is that if your system takes way longer than 1 second to
process 100k records, 100k obviously isn't a reasonable max. Many large batches
will be defined while that first batch is running, before backpressure
is involved at all. Try a lower max.
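The arithmetic behind "try a lower max" can be sketched roughly as follows.
This is a back-of-the-envelope Python sketch under stated assumptions: the
single partition and the 5-second processing time for a 100k batch are
hypothetical (the report only says "more than 1 second"), and the helper names
are made up for illustration.

```python
def records_per_batch(max_rate_per_partition, partitions, batch_interval_s):
    """Upper bound on records in one batch while only the max rate applies."""
    return max_rate_per_partition * partitions * batch_interval_s


def batches_defined_while_busy(first_batch_processing_s, batch_interval_s):
    """Roughly how many batches get defined while the first one is running."""
    return first_batch_processing_s // batch_interval_s


def lower_max_rate_per_partition(batch_records, batch_processing_s,
                                 partitions, headroom_percent=80):
    """A per-partition max that one batch interval can plausibly absorb."""
    # Sustained throughput observed for a full-size batch, in records/second.
    sustained = batch_records // batch_processing_s
    # Leave some headroom and split the budget across partitions.
    return sustained * headroom_percent // (100 * partitions)


# Settings from the report: max 100000 per partition, 1-second batches.
# Assumed for illustration: 1 partition, 5 s to process a 100k batch.
per_batch = records_per_batch(100000, partitions=1, batch_interval_s=1)
queued = batches_defined_while_busy(first_batch_processing_s=5,
                                    batch_interval_s=1)
backlog = per_batch * queued
suggested = lower_max_rate_per_partition(per_batch, batch_processing_s=5,
                                         partitions=1)

print(per_batch, queued, backlog, suggested)
```

Under these assumptions about five full-size batches are already queued before
the first completion can feed the rate estimator, which is why a max closer to
the measured throughput is the safer starting point.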
> Backpressure rate not adjusting
> -------------------------------
>
> Key: SPARK-17938
> URL: https://issues.apache.org/jira/browse/SPARK-17938
> Project: Spark
> Issue Type: Bug
> Components: Streaming
> Affects Versions: 2.0.0, 2.0.1
> Reporter: Samy Dindane
>
> spark-streaming and spark-streaming-kafka-0-10 are both at version 2.0.1.
> Same behavior with 2.0.0, though.
> spark.streaming.kafka.consumer.poll.ms is set to 30000
> spark.streaming.kafka.maxRatePerPartition is set to 100000
> spark.streaming.backpressure.enabled is set to true
> `batchDuration` of the streaming context is set to 1 second.
> I consume a Kafka topic using KafkaUtils.createDirectStream().
> My system can handle 100k-record batches, but it takes more than 1 second
> to process them all. I'd thus expect the backpressure to reduce the number of
> records fetched in the next batch to keep the processing delay
> under 1 second.
> But this does not happen, and the backpressure rate stays the same:
> stuck at `100.0`, no matter how the other variables change (processing time,
> error, etc.).
> Here's a log showing how all these variables change but the chosen rate stays
> the same: https://gist.github.com/Dinduks/d9fa67fc8a036d3cad8e859c508acdba (I
> would have attached a file but I don't see how).
> Is this the expected behavior and I am missing something, or is this a bug?
> I'll gladly help by providing more information or writing code if necessary.
> Thank you.