Re: Performance of Apache Beam

Jan Lukavský Mon, 18 Oct 2021 02:30:19 -0700

Hi Azhar,

-dev, +user

this kind of question cannot be answered in general. The overhead will depend on the job and the SDK you use. Using Java SDK with (classical) FlinkRunner should give the best performance on Flink, although the overhead will not be completely nullified. The way Beam is constructed - with portability being one of the main concerns - necessarily brings some overhead compared to the job being written and optimized for single runner only (using Flink's native API in this case). I'd suggest you evaluate the programming model and portability guarantees that Apache Beam gives you instead of pure performance. On the other hand Apache Beam tries hard to minimize the overhead, so you should not expect *vastly* worse performance. I'd say the best way to go is to implement a simplistic Pipeline somewhat representing your use-case and then measure the performance on this specific instance.
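To make the "implement a simplistic Pipeline and measure it" suggestion concrete, here is a minimal sketch of such a measurement in Python. Everything in it is an assumption for illustration: the helper names (`measure_throughput`, `plain_job`, `beam_job`), the toy workload (counting elements per key), and the use of the Python SDK with the default local runner. A real comparison would run the same logic with the FlinkRunner and against a job written with Flink's native API, on data resembling your use-case.

```python
import time
from statistics import median


def measure_throughput(run_job, n_elements, repeats=3):
    """Run `run_job(n_elements)` `repeats` times; return median elements/second."""
    rates = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_job(n_elements)
        # Guard against a zero reading on very coarse clocks.
        elapsed = max(time.perf_counter() - start, 1e-9)
        rates.append(n_elements / elapsed)
    return median(rates)


def plain_job(n_elements):
    """Hypothetical baseline: count elements per key without any framework."""
    counts = {}
    for x in range(n_elements):
        key = x % 10
        counts[key] = counts.get(key, 0) + 1
    return counts


def beam_job(n_elements):
    """The same toy workload as a Beam pipeline (assumes apache_beam is installed).

    With no options given, the pipeline runs on the default local runner;
    pipeline options would be needed to target Flink instead.
    """
    import apache_beam as beam

    with beam.Pipeline() as p:
        (
            p
            | beam.Create(range(n_elements))
            | beam.Map(lambda x: (x % 10, 1))
            | beam.CombinePerKey(sum)
        )


if __name__ == "__main__":
    n = 100_000
    print("baseline elements/s:", measure_throughput(plain_job, n))
    # Uncomment once apache_beam is available in the environment:
    # print("beam elements/s:", measure_throughput(beam_job, n))
```

The absolute numbers from such a toy job mean little; what matters is the ratio between the two variants on a workload shaped like yours.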

Regarding fault-tolerance and backpressure, Apache Beam model does not handle those (with the exception of bundles being processed as atomic units), so these are delegated to the runner - FlinkRunner will therefore behave the way Apache Flink defines these concepts.


Hope this helps,

 Jan

On 10/17/21 17:53, azhar mirza wrote:

Hi Team
Could you please answer the following questions?
I need to know the performance of Apache Beam vs Flink: if we use Flink as the runner for Beam, what will be the additional overhead of converting Beam to Flink?
How are fault tolerance and resiliency handled in Apache Beam?
How does Apache Beam handle backpressure?

Thanks
Azhar
