Re: beam + scala + streamline

Davor Bonaci Thu, 13 Apr 2017 08:57:34 -0700

Hi Georg --
Great to see you are evaluating Beam for your scenario.


> someone told me that e.g. the flink runner for beam seems to be slower
> than a native flink job. Is this true? Did you observe such characteristics
> for several runners?

This should not be true in a general sense -- the performance should be
roughly equivalent. The Flink runner in Beam constructs a "native" Flink
pipeline; the overhead of invoking a user-defined function typically amounts
to setting a few fields and making a function call, which is negligible. The
actual performance of a pipeline tends to depend on other factors --
stragglers, how fast the system can adapt to changing load, etc.
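
To make the "set a few fields and invoke a function" point concrete, here is a
toy sketch in plain Python. It is only an illustration, not Beam or Flink code:
ToyContext, run_native and run_wrapped are hypothetical names made up for this
example. It shows the shape of the extra per-element work a wrapper adds around
the same user function.

```python
class ToyContext:
    """Hypothetical per-bundle context; not a Beam or Flink class."""

    def __init__(self):
        self.element = None
        self.outputs = []

    def output(self, value):
        self.outputs.append(value)


def run_native(fn, elements):
    # "Native" path: the engine calls the user function directly.
    return [fn(e) for e in elements]


def run_wrapped(fn, elements):
    # "Wrapped" path: set a field on a reused context, then call the same function.
    ctx = ToyContext()
    for e in elements:
        ctx.element = e
        ctx.output(fn(ctx.element))
    return ctx.outputs


if __name__ == "__main__":
    data = list(range(5))
    double = lambda x: 2 * x
    # Both paths run the same user function and yield the same results.
    print(run_native(double, data) == run_wrapped(double, data))
```

The wrapper adds one attribute assignment and one extra call per element; the
user function itself is unchanged.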

(If there's a gap somewhere, it is likely a bug -- and we'd like to know
about it and fix it.)

> in case I want to use some low level functionality (specific to a runner)
> like ML, graph processing or sql-tables api, is it possible to just drop
> from the beam API one level deeper to the actual runner and sort of mesh
> beam with runner native code to integrate these features?

The Beam API, in a general sense, doesn't provide such hooks, as that would
break portability.

I wouldn't advise this, but technically, it wouldn't be hard -- you'd
create a PTransform in Beam, and modify the runner to replace it with its
own runner-specific implementation. Instead, I'd suggest using Beam's
abstractions and, if a pattern or feature is missing, working with us to
augment the Beam model accordingly.
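
For illustration only, the "replace a transform inside the runner" idea can be
sketched as a toy model in plain Python. None of these names are Beam APIs:
ToyTransform, SortValues, ToyRunner and register_override are hypothetical and
exist only in this example. A transform carries a portable default
implementation, and a modified runner may swap in its own as long as the
result stays the same.

```python
class ToyTransform:
    """Hypothetical stand-in for a transform; not Beam's PTransform."""

    def expand(self, elements):
        raise NotImplementedError


class SortValues(ToyTransform):
    # Portable default implementation that any runner could execute.
    def expand(self, elements):
        return sorted(elements)


class ToyRunner:
    """Hypothetical runner that can replace transforms by type."""

    def __init__(self):
        self.overrides = {}

    def register_override(self, transform_type, replacement):
        # replacement: a callable taking the input elements.
        self.overrides[transform_type] = replacement

    def apply(self, transform, elements):
        replacement = self.overrides.get(type(transform))
        if replacement is not None:
            # Runner-specific path, used instead of the portable expansion.
            return replacement(elements)
        return transform.expand(elements)


if __name__ == "__main__":
    runner = ToyRunner()
    # Assumed runner-native routine; here it just delegates to sorted().
    runner.register_override(SortValues, lambda xs: sorted(xs))
    print(runner.apply(SortValues(), [3, 1, 2]))
```

The pipeline author still writes SortValues; only a runner that knows about
the override takes the specialised path, which is exactly why this couples the
pipeline's behaviour to one runner.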

Hope this helps -- and that you find Beam a good fit for your case. Please
let us know if we can assist any further -- thanks!

Davor
