Re: How to run a per-job cluster for a Beam pipeline w/ FlinkRunner on YARN

Aljoscha Krettek Tue, 17 Nov 2020 00:17:08 -0800

Hi,

to ensure that we really are using per-job mode, could you try and use


$ flink run -t yarn-per-job -d <...>

This will directly specify that we want to use the YARN per-job
executor, which bypasses some of the logic in the older YARN code paths
that differentiate between YARN session mode and YARN per-job mode.
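
For illustration, here is a rough sketch of how the submission quoted further down might look with that flag swapped in for `-m yarn-cluster`. It assumes the installed Flink CLI accepts `-t`, which may depend on the Flink version, and it reuses the jar name and pipeline options from the original message unchanged. The `SUBMIT` variable is a hypothetical switch added only so the snippet prints the command by default instead of submitting anything.

```shell
# Arguments from the original message, with "-m yarn-cluster" replaced by
# "-t yarn-per-job". The brackets are quoted so the shell never treats
# them as a glob pattern.
set -- flink run -t yarn-per-job -d \
    my-beam-pipeline.jar \
    --runner=FlinkRunner \
    '--flinkMaster=[auto]' \
    --parallelism=8

# SUBMIT is a hypothetical switch: by default only print the command,
# and submit to YARN only when SUBMIT=1 is set.
if [ "${SUBMIT:-0}" = "1" ]; then
    "$@"
else
    echo "$*"
fi
```

If the YARN application terminates on its own once the job finishes, that would suggest the job really ran in per-job mode.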


Best,
Aljoscha

On 17.11.20 07:02, Tzu-Li (Gordon) Tai wrote:

Hi,

Not sure if this question would be more suitable for the Apache Beam
mailing lists, but I'm pulling in Aljoscha (CC'ed) who would know more
about Beam and whether or not this is an expected behaviour.

Cheers,
Gordon

On Mon, Nov 16, 2020 at 10:35 PM Dongwon Kim <eastcirc...@gmail.com> wrote:

Hi,

I'm trying to run a per-job cluster for a Beam pipeline w/ FlinkRunner on
YARN as follows:

flink run -m yarn-cluster -d \
     my-beam-pipeline.jar \
     --runner=FlinkRunner \
     --flinkMaster=[auto] \
     --parallelism=8

Instead of creating a per-job cluster as wished, the above command seems
to create a session cluster and then submit a job onto the cluster.

I suspect this because:
(1) In the attached file, there's "Submit New Job" which is not shown in
other per-job applications that are written in Flink APIs and submitted to
YARN similar to the above command.

[image: beam on yarn.png]
(2) When the job is finished, the YARN application is still in its RUNNING
state without being terminated. I had to kill the YARN application manually.

FYI, I'm using
- Beam v2.24.0 (Flink 1.10)
- Hadoop v3.1.1

Thanks in advance,

Best,

Dongwon
