hehuiyuan opened a new pull request #23999: Add additional explanation for 
"Setting the max receiving rate" in streaming-programming-guide.md
URL: https://github.com/apache/spark/pull/23999
 
 
   In streaming-programming-guide.md, as follows:
   
   Setting the max receiving rate - If the cluster resources are not large
   enough for the streaming application to process data as fast as it is
   being received, the receivers can be rate limited by setting a maximum
   rate limit in terms of records / sec. See the configuration parameters
   spark.streaming.receiver.maxRate for receivers and
   spark.streaming.kafka.maxRatePerPartition for the Direct Kafka approach.
   In Spark 1.5, we introduced a feature called backpressure that
   eliminates the need to set this rate limit, as Spark Streaming
   automatically figures out the rate limits and dynamically adjusts them
   if the processing conditions change. This backpressure can be enabled by
   setting the configuration parameter
   spark.streaming.backpressure.enabled to true.
   
   I think we should be more rigorous here. For the Direct Kafka approach,
   if the first batch of data is very large, that batch may keep processing
   for a very long time and the application cannot run normally, because
   backpressure can only adjust the rate based on batches that have already
   completed, so it does not limit the very first batch.
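   As a sketch of the point above, the two settings are usually combined:
   backpressure handles steady-state tuning, while maxRatePerPartition
   bounds the first batch that backpressure cannot yet see. The values
   below are illustrative, not recommendations.

   ```
   # spark-defaults.conf (illustrative values)

   # Let Spark Streaming adjust the ingestion rate dynamically.
   spark.streaming.backpressure.enabled        true

   # Backpressure only learns from completed batches, so also cap the
   # per-partition rate for the Direct Kafka approach to bound the very
   # first batch. With 10 Kafka partitions and a 10-second batch interval,
   # the first batch is at most 10 * 10000 * 10 = 1,000,000 records.
   spark.streaming.kafka.maxRatePerPartition   10000
   ```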
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
