Hi

I had a similar use case recently, and adding a metadata key solved the issue 
(see https://github.com/GoogleCloudDataproc/initialization-actions/pull/334). 
You keep the original initialization action and add, for example (using 
gcloud): 
'--metadata flink-snapshot-url=http://mirrors.up.pt/pub/apache/flink/flink-1.9.1/flink-1.9.1-bin-scala_2.11.tgz'
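
For context, here is a rough sketch of how the full cluster-creation command 
might look with that flag. The cluster name, region, and bucket path are 
hypothetical placeholders (I am assuming an unmodified copy of flink.sh sits 
in your own bucket); only the flink-snapshot-url key and the mirror URL come 
from the setup described above. The script just prints the command so you can 
check it before running anything.

```shell
#!/bin/sh
# Hypothetical placeholders: replace with your own cluster name and region.
CLUSTER_NAME="my-flink-cluster"
REGION="us-central1"

# Assumption: an unmodified copy of the Flink initialization action
# uploaded to a bucket you control (hypothetical path).
INIT_ACTION="gs://my-bucket/flink/flink.sh"

# Flink binary release to install instead of the default version.
FLINK_URL="http://mirrors.up.pt/pub/apache/flink/flink-1.9.1/flink-1.9.1-bin-scala_2.11.tgz"

# Assemble the gcloud invocation; the metadata key tells the
# initialization action which Flink tarball to download.
CMD="gcloud dataproc clusters create ${CLUSTER_NAME} \
--region ${REGION} \
--initialization-actions ${INIT_ACTION} \
--metadata flink-snapshot-url=${FLINK_URL}"

# Dry run: print the command for review. To execute it: eval "$CMD"
echo "$CMD"
```

Any other mirror hosting a Flink binary release should work the same way, as 
long as the Flink and Scala versions match what your Beam version supports.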

Cheers
Pawel
________________________________
From: Ismaël Mejía <[email protected]>
Sent: Friday, February 7, 2020 2:24 PM
To: Xander Song <[email protected]>; [email protected] 
<[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: Running a Beam Pipeline on GCP Dataproc Flink Cluster

[email protected]<mailto:[email protected]>


On Fri, Feb 7, 2020 at 12:54 AM Xander Song 
<[email protected]<mailto:[email protected]>> wrote:
I am attempting to run a Beam pipeline on a GCP Dataproc Flink cluster. I have 
followed the instructions at this repo 
<https://github.com/GoogleCloudDataproc/initialization-actions/tree/master/flink> 
to create a Flink cluster on Dataproc using an initialization action. However, 
the resulting cluster uses version 1.5.6 of Flink, and my project requires a 
more recent version (version 1.7, 1.8, or 1.9) for compatibility with Beam 
<https://beam.apache.org/documentation/runners/flink/>.

Inside the flink.sh script in the linked repo, there is a line for installing 
Flink from a snapshot URL instead of apt 
<https://github.com/GoogleCloudDataproc/initialization-actions/blob/81e453d8f8a036e371e144d5103aaa38ecb2c679/flink/flink.sh#L53>. 
Is this the correct mechanism for installing a different version of Flink 
using the initialization script? If so, how is it meant to be used?

Thank you in advance.
