gaborgsomogyi commented on a change in pull request #23999: [docs]Add additional explanation for "Setting the max receiving rate" in streaming-programming-guide.md
URL: https://github.com/apache/spark/pull/23999#discussion_r264248341
 
 

 ##########
 File path: docs/streaming-programming-guide.md
 ##########
 @@ -2036,7 +2036,7 @@ To run a Spark Streaming applications, you need to have the following.
   `spark.streaming.receiver.maxRate` for receivers and `spark.streaming.kafka.maxRatePerPartition`
   for Direct Kafka approach. In Spark 1.5, we have introduced a feature called *backpressure* that
   eliminate the need to set this rate limit, as Spark Streaming automatically figures out the
 -  rate limits and dynamically adjusts them if the processing conditions change. This backpressure
 +  rate limits and dynamically adjusts them if the processing conditions change.If the first batch of data is very large which causes the first batch is processing all the time and the task can not work normally , using a maximum rate limit can solve the problem .This backpressure
 
 Review comment:
   I see the intention, but I agree with Sean: this change doesn't make the doc better. I agree that if the first batch's processing time is significantly longer than the batch interval, then microbatches can queue up, but I would rephrase things.
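
For reference, the settings under discussion are ordinary Spark configuration keys. Below is a minimal sketch, assuming a Scala application; the application name, batch interval, and rate values are purely illustrative:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Minimal sketch (illustrative values): combine static rate caps with backpressure.
val conf = new SparkConf()
  .setAppName("RateLimitedStreaming") // app name is illustrative
  // Receiver-based sources: cap each receiver at 1000 records/sec.
  .set("spark.streaming.receiver.maxRate", "1000")
  // Direct Kafka approach: cap each Kafka partition at 1000 records/sec.
  .set("spark.streaming.kafka.maxRatePerPartition", "1000")
  // Spark 1.5+ backpressure: adjusts ingest rates to observed processing conditions.
  .set("spark.streaming.backpressure.enabled", "true")

val ssc = new StreamingContext(conf, Seconds(10)) // 10-second batch interval
```

Backpressure derives its rate estimate from completed batches, so the very first batch has no estimate to work from; a static cap such as `spark.streaming.kafka.maxRatePerPartition` is what keeps that initial batch bounded, which appears to be the scenario the proposed wording tries to describe.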
