You can find longer explanations for backpressure on the Interwebs, but briefly,
1. Buffers are never infinite. Even if you're willing to write and maintain a
nontrivial scheme for flushing to disk, you might still have a major failure
scenario where the consumer is gone forever. Depending on volume and disk size,
you could run out of space quickly. Then what should you do? Does this
low-level producer know what makes sense from a strategic perspective? The
answers lead you to...

2. Backpressure composes. If you have a graph of backpressure-enabled streams,
when one slows down, the backpressure can propagate upstream. Somewhere at the
beginning of the graph, someone has to decide what to do about a major
slowdown, like in the previous scenario, but now it's a strategic concern that
you can solve at the architecture level, rather than forcing some low-level
stream component to make an arbitrary decision about what to do.

dean

On Thu, Jan 12, 2017 at 9:36 AM, Gábor Gévay <gga...@gmail.com> wrote:

> Hello,
>
> I would like to ask about the rationale behind the backpressure
> mechanism in Flink.
>
> As I understand it, backpressure is for handling the problem of one
> operator (or source) producing records faster than the next operator
> can consume them. However, an alternative solution would be to have a
> potentially "infinite" buffer for the incoming records of an operator,
> by spilling the buffer to disk when it is getting too large. Could you
> tell me why backpressure is considered a better option?
>
> Also, I would be interested in whether going with backpressure was a
> streaming-specific decision, or do you think that having backpressure
> is also better in batch jobs?
>
> Thanks,
> Gábor

--
*Dean Wampler, Ph.D.*
Fast Data Product Architect, Office of the CTO
dean.wamp...@lightbend.com
Lightbend <http://lightbend.com>
@deanwampler <http://twitter.com/deanwampler>
https://www.linkedin.com/in/deanwampler
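P.S. This is not how Flink implements it internally (Flink uses credit-based
flow control over bounded network buffers), but the core idea in the two points
above can be sketched in a few lines of plain Java: a *bounded* buffer between
a fast producer and a slow consumer makes `put()` block when the buffer fills,
which automatically slows the producer to the consumer's pace. Chain several of
these and the slowdown propagates upstream. The class name and buffer size here
are arbitrary choices for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class BackpressureSketch {
    public static void main(String[] args) throws InterruptedException {
        // Bounded buffer between producer and consumer. When it is full,
        // put() blocks -- that blocking IS the backpressure signal.
        BlockingQueue<Integer> buffer = new ArrayBlockingQueue<>(4);

        Thread producer = new Thread(() -> {
            for (int i = 0; i < 100; i++) {
                try {
                    buffer.put(i); // blocks when buffer is full
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        });

        final int[] consumed = {0};
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 100; i++) {
                    buffer.take();                  // deliberately slow consumer
                    TimeUnit.MILLISECONDS.sleep(1); // simulate per-record work
                    consumed[0]++;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();

        // Every record arrives, yet at no point did more than 4 records
        // sit in memory -- no unbounded (or disk-spilled) buffer needed.
        System.out.println("consumed = " + consumed[0]);
    }
}
```

Note that the producer never had to decide what to do with overflow: the
bounded buffer pushed that decision upstream, which is the composition argument
in point 2.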