[
https://issues.apache.org/jira/browse/SPARK-7398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606393#comment-14606393
]
Tathagata Das commented on SPARK-7398:
--------------------------------------
Thank you very much! This is much cleaner and easier to understand. Here are
my preliminary thoughts.
1. I strongly recommend introducing an interface for making the heuristic
algorithm pluggable.
2. Should there be separate instances of the heuristic algorithm running
for separate input DStreams? That part is not clear in the doc. Different
input streams may want to calculate their rates differently.
3. The calculated rate from the heuristic algorithm should be communicated
through the input DStream interface. This is to allow both receiver and
non-receiver input DStreams to handle calculated rates differently.
4. The CongestionListener interface is a good-to-have in the future but not
the most critical feature. So I suggest this be done later, in future
iterations.
5. There should not be any new dependencies in the streaming module.
The rest of the details are commented in the doc.
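
To make points 1 to 3 concrete, here is a minimal, language-neutral sketch written in Python. It is not Spark code and not the design from the doc: every name in it (RateEstimator, ProcessingSpeedEstimator, InputStream, ReceiverStream, DirectStream, publish_rate) is hypothetical, and the heuristic is a deliberately naive placeholder. It only illustrates a pluggable estimator interface, one estimator instance per input stream, and the rate being delivered through the stream so that receiver and non-receiver streams can react differently.

```python
from abc import ABC, abstractmethod
from typing import Optional


class RateEstimator(ABC):
    """Hypothetical pluggable heuristic (point 1)."""

    @abstractmethod
    def compute(self, num_records: int, processing_ms: float) -> Optional[float]:
        """Return a new bound in records/second, or None for no update."""


class ProcessingSpeedEstimator(RateEstimator):
    """Placeholder heuristic: cap ingestion at the last observed processing speed."""

    def compute(self, num_records, processing_ms):
        if num_records <= 0 or processing_ms <= 0:
            return None  # an empty batch carries no information
        return num_records / (processing_ms / 1000.0)


class InputStream:
    """Stand-in for an input DStream; each one owns its estimator (point 2)."""

    def __init__(self, name: str, estimator: RateEstimator):
        self.name = name
        self.estimator = estimator
        self.max_rate: Optional[float] = None  # None means unbounded

    def on_batch_completed(self, num_records: int, processing_ms: float) -> None:
        new_rate = self.estimator.compute(num_records, processing_ms)
        if new_rate is not None:
            self.publish_rate(new_rate)

    def publish_rate(self, rate: float) -> None:
        """The rate arrives through the stream interface (point 3)."""
        self.max_rate = rate


class ReceiverStream(InputStream):
    """Receiver-based stream: would forward the bound to its receiver."""

    def __init__(self, name, estimator):
        super().__init__(name, estimator)
        self.sent_to_receiver = []  # stand-in for messages to the receiver

    def publish_rate(self, rate):
        super().publish_rate(rate)
        self.sent_to_receiver.append(rate)


class DirectStream(InputStream):
    """Non-receiver stream: applies the bound when sizing the next batch."""

    def records_for_next_batch(self, batch_interval_ms: float) -> Optional[int]:
        if self.max_rate is None:
            return None  # no bound computed yet
        return int(self.max_rate * batch_interval_ms / 1000.0)


# Each stream gets its own estimator instance, so rates evolve independently.
receiver_stream = ReceiverStream("receiver-based", ProcessingSpeedEstimator())
direct_stream = DirectStream("direct", ProcessingSpeedEstimator())
receiver_stream.on_batch_completed(num_records=1000, processing_ms=2000)
direct_stream.on_batch_completed(num_records=3000, processing_ms=1500)
```

Because the estimator sits behind an interface, a different heuristic could be swapped in per stream without touching the stream classes, and no new dependency is needed.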
On Mon, Jun 29, 2015 at 2:22 AM, Iulian Dragos (JIRA) <[email protected]> wrote:
> Add back-pressure to Spark Streaming
> ------------------------------------
>
> Key: SPARK-7398
> URL: https://issues.apache.org/jira/browse/SPARK-7398
> Project: Spark
> Issue Type: Improvement
> Components: Streaming
> Affects Versions: 1.3.1
> Reporter: François Garillot
> Priority: Critical
> Labels: streams
>
> Spark Streaming has trouble dealing with situations where
> batch processing time > batch interval
> that is, where the throughput of input data is high relative to Spark's
> ability to remove data from the queue.
> If this throughput is sustained for long enough, it leads to an unstable
> situation where the memory of the Receiver's Executor overflows.
> This aims at transmitting a back-pressure signal back to data ingestion to
> help with dealing with that high throughput, in a backwards-compatible way.
> The original design doc can be found here:
> https://docs.google.com/document/d/1ZhiP_yBHcbjifz8nJEyPJpHqxB1FT6s8-Zk7sAfayQw/edit?usp=sharing
> The second design doc (without all the background info, and more centered on
> the implementation) can be found here:
> https://docs.google.com/document/d/1ls_g5fFmfbbSTIfQQpUxH56d0f3OksF567zwA00zK9E/edit?usp=sharing