I wouldn't say one is, or will always be, in front of or behind another.
That's a great way to phrase it. I think it is very common to jump to
the conclusion that one system is better than the other. In reality it's
often much more complicated.
For example, one of the things Beam has focused on is a language
portability framework. Do I get this with Flink? No. Does that mean Beam
is better than Flink? No. Maybe a better question would be, do I want to
be able to run Python pipelines?
This is just one example; there are many more factors to consider.
Cheers,
Max
On 30.04.19 10:59, Robert Bradshaw wrote:
Though we all certainly have our biases, I think it's fair to say that
all of these systems are constantly innovating, borrowing ideas from
one another, and have their strengths and weaknesses. I wouldn't say
one is, or will always be, in front of or behind another.
Take the given example, Spark Structured Streaming. Of course the
API itself is Spark-specific, but it borrows heavily (among other
things) from ideas that Beam itself pioneered long before Spark 2.0,
specifically the unification of batch and streaming processing into a
single API, and the event-time based windowing (triggering) model for
consistently and correctly handling distributed, out-of-order data
streams.
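To make the event-time idea concrete, here is a minimal sketch in plain Python. It does not use the Beam or Spark APIs, and the names (WINDOW_SIZE, window_start, count_per_window) are hypothetical, chosen only for illustration. It assigns each event to a fixed window based on its own timestamp rather than its arrival order, so the per-window counts come out the same however the input is shuffled. Triggers, watermarks, and late-data handling are deliberately left out.

```python
from collections import defaultdict

WINDOW_SIZE = 60  # seconds; hypothetical fixed window width


def window_start(event_time, size=WINDOW_SIZE):
    """Return the start of the fixed window that contains event_time."""
    return event_time - (event_time % size)


def count_per_window(events, size=WINDOW_SIZE):
    """Count events per (window start, key) using event time, not arrival order."""
    counts = defaultdict(int)
    for event_time, key in events:
        counts[(window_start(event_time, size), key)] += 1
    return dict(counts)


# Events as (event_time_seconds, key), listed in arrival order.
# The event stamped t=59 arrives after the one stamped t=61 (out of order).
arrivals = [(5, "a"), (61, "a"), (59, "a"), (130, "b")]

in_arrival_order = count_per_window(arrivals)
in_event_order = count_per_window(sorted(arrivals))
```

Because grouping depends only on the timestamp carried by each event, the same function gives the same answer on a bounded, already-sorted input (the batch case) and on records that trickle in out of order (the streaming case).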
Of course there are also operational differences. Spark, for example,
is very tied to the micro-batch style of execution whereas Flink is
fundamentally very continuous, and Beam delegates to the underlying
runner.
It is certainly Beam's goal to keep overhead minimal, and one of the
primary selling points is the flexibility of portability (of both the
execution runtime and the SDK) as your needs change.
- Robert
On Tue, Apr 30, 2019 at 5:29 AM <[email protected]> wrote:
Of course! I suspect Beam will always be one or two steps behind the new
functionality that is available or yet to come.
For example, Spark Structured Streaming is still not available, there are no
CEP APIs yet, and much more.
On Apr 30, 2019, at 12:11 AM, Pankaj Chand <[email protected]> wrote:
Will Beam add any overhead or lack certain API/functions available in
Spark/Flink?