Hi Matt,

I had already seen your talk (not yet watched it in its entirety). Will do tomorrow.

Regards Hans-Peter

On Tue, 18 Jan 2022 at 15:59, Matt Casters <[email protected]> wrote:

> Hi Hans-Peter,
>
> "Flink" is typically a cluster of Flink servers. You can run pipelines on
> it which process data in a streaming or batch fashion in a safe and
> parallel way.
> Apache Beam is a general API for running these pipelines.
>
> So now with Apache Hop you can design pipelines in a graphical way (or
> with an SDK for that matter) and this is then represented as pure metadata.
> The metadata describes the things that need to happen in a pipeline: the
> transforms and the way they are connected with hops.
>
> For more details I can point you to our getting started with Beam page:
> https://hop.apache.org/manual/latest/pipeline/beam/getting-started-with-beam.html
> I also did a 2h workshop for the Apache Beam Summit last summer:
> https://www.youtube.com/watch?v=sZSIbcPtebI
>
> The gist of it, to come back to your question, is that you can design
> pipelines which are capable of executing on Apache Flink, Apache Spark or
> indeed GCP Dataflow.
> The exact same pipeline metadata can run on all platforms and can be unit
> tested as well for example.
> We accomplished this by wrapping all the platform specific options in what
> we call Pipeline Run Configurations. For example, here is the
> documentation for the Apache Flink run configuration:
> https://hop.apache.org/manual/latest/pipeline/pipeline-run-configurations/beam-flink-pipeline-engine.html
>
> This way you can simply specify how (on which platform) you want to run
> your pipeline and Hop will take care of it. All the software and libraries
> that you need from Apache Hop, Beam and Flink are included if you use Hop.
>
> Cheers,
> Matt
>
> On Tue, Jan 18, 2022 at 2:57 PM HG <[email protected]> wrote:
>
>> Hi,
>>
>> One question which has not yet become entirely clear to me.
>> I want to run jobs on Flink.
>> Do I also need to install and use Beam?
>>
>> On the page : What is beam it says:
>> Pipelines can also run on Apache Spark, Apache Flink and Google Dataflow
>> through the Apache Beam runtime configurations.
>>
>> What does this mean in practice?
>>
>> Regards Hans-Peter
