Re: Dynamic topologies / replay

Raghu Angadi Fri, 17 Aug 2018 10:22:13 -0700

On Thu, Aug 16, 2018 at 2:20 PM Thomas Browne <[email protected]> wrote:


> I just watched the excellent presentation by Markku Leppisto in Singapore.
> I consult for a financial broker in London.
> Our use case is streaming financial data on which various analytics are
> performed to find relative value trading opportunities in the fixed income
> markets.
>
> I have two questions about Apache Beam:
>
> a) Does it, or will it, support dynamic topologies? We have many analysts
> all of whom want slight variations of the topologies, and they may want to
> change the topologies, or the function ("node") parameters, at runtime. Is
> this possible, and if not is "second prize" of simply growing the DAG with
> non-recombinant node branches at runtime, possible?
>

Topologies are fixed. Some form of "second prize" is probably doable, but I
think the pipeline author needs to handle it at runtime.

b) Does, or will Apache beam support "replays"? We find quite often that
> historical inputs change, a long time later (for example, a bond price from
> a few weeks ago is seen to have been erroneous, and is changed). We then
> have to re-run from that point forward as all downstream calcs are
> dependent on all prices. I believe Apache Flink supports this type of
> functionality? Is this possilble with Beam?
>

A streaming pipeline can be restarted from earlier resume point as long as
the sources support it. Other than that, I don't think there is much more
explicit support (e.g. in-built support to combine updated historical
aggregates to future updates etc). Flink's save-points should help with
Beam on Flink as well, it might make it simpler to save multiple snapshots.


> I am evaluating the various dataflow programming environments that are
> springing up and will advise based partly on the above.
>
> Many thanks,
>
> Thomas
>
>

Re: Dynamic topologies / replay

Reply via email to