In Beam, the Kafka connector does not know anything about the underlying execution engine (here Flink). It is instead translated by the runner into a user defined function in Flink. So it is expected that the resulting DAG does not look the same as it would with a native Flink source.
On Fri, Feb 26, 2021 at 5:18 AM yilun zhang <[email protected]> wrote: > So sorry for subscribing errors on my side resulted in multiple duplicate > email! > > Thanks for reply and it does help! > > I am confused when submitting beam job with kafka connector to flink, I > noticed that flink DAG diagram will included readFromKafka as part of flink > workflow. while if we submit a pyflink job(connected with kafka) directly > to flink, the flink workflow will exclude reading from kafka(which is the > resource) but only has data processing parts. > > Is that how beam want flink to do? > > Thanks a lot and sincerely apologize again for silly duplicated emails! > > Yilun > > Sam Bourne <[email protected]>于2021年2月25日 周四上午11:58写道: > >> Hi Yilun! >> >> I made a quick proof of concept repo showcasing how to run a beam >> pipeline in flink on k8s. It may be useful for you as reference. >> >> https://github.com/sambvfx/beam-flink-k8s >> >> >> On Wed, Feb 24, 2021, 8:13 AM yilun zhang <[email protected]> wrote: >> >>> Hey, >>> >>> Our team is trying to use beam with connector Kafka and runner flink to >>> gather information and process data. We adopt python sdk and build in java >>> 11 in python 3.7 sdk image as java runtime for kafka expansion service. >>> so : >>> image: beam python 3.7 docker image + build in java 11 >>> connector: kafka >>> runner: flink >>> container: kubernetes >>> >>> We encounter an docker not found error when running: >>> python3 -m kafka_test --runner=FlinkRunner >>> --flink_master=flink-job-manager:8081 --flink_submit_uber_jar >>> --environment_type=EXTERNAL --environment_config=localhost:50000 >>> >>> We notice that in https://beam.apache.org/roadmap/portability/ it >>> mentioned the prerequisite also includes Docker. We wonder what is the >>> docker usage here? Is there any suggested way to build docker in >>> k8s container? (something maybe like sysbox for docker in docker?) >>> >>> Or maybe we should not use beam sdk+runner in k8s? >>> >>> Thanks, >>> Yilun >>> >>
