Hi Azhar,
-dev <mailto:[email protected]> +user <mailto:[email protected]>
this kind of question cannot be answered in general. The overhead will
depend on the job and the SDK you use. Using Java SDK with (classical)
FlinkRunner should give the best performance on Flink, although the
overhead will not be completely nullified. The way Beam is constructed -
with portability being one of the main concerns - necessarily brings
some overhead compared to the job being written and optimized for single
runner only (using Flink's native API in this case). I'd suggest you
evaluate the programming model and portability guarantees, that Apache
Beam gives you instead of pure performance. On the other hand Apache
Beam tries hard to minimize the overhead, so you should not expect
*vastly* worse performance. I'd say the best way to go is to implement a
simplistic Pipeline somewhat representing your use-case and then measure
the performance on this specific instance.
Regarding fault-tolerance and backpressure, Apache Beam model does not
handle those (with the exception of bundles being processed as atomic
units), so these are delegated to the runner - FlinkRunner will
therefore behave the way Apache Flink defines these concepts.
Hope this helps,
Jan
On 10/17/21 17:53, azhar mirza wrote:
Hi Team
Could you please let me know following below answers .
I need to know performance of apache beam vs flink if we use flink as
runner for Beam, what will be the additional overhead converting Beam
to flink
How fault tolerance and resiliency handled in apache beam.
How apache beam handles backpressure?
Thanks
Azhar