Hi,

I'm trying to set up a standalone cluster for the first time. My current configuration is
4 servers (1 jobmanager and 3 taskmanagers).

a) starting the cluster
swissbib@sb-ust1:/swissbib_index/apps/flink/bin$ ./start-cluster.sh
Starting cluster.
Starting standalonesession daemon on host sb-ust1.
Starting taskexecutor daemon on host sb-ust2.
Starting taskexecutor daemon on host sb-ust3.
Starting taskexecutor daemon on host sb-ust4.


On the taskmanager side I get the error:
2019-05-01 21:16:32,794 WARN akka.remote.ReliableDeliverySupervisor - Association with remote system [akka.ssl.tcp://flink@sb-ust1:6123] has failed, address is now gated for [50] ms. Reason: [class [B cannot be cast to class [C ([B and [C are in module java.base of loader 'bootstrap')]
2019-05-01 21:16:41,932 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor - Could not resolve ResourceManager address akka.ssl.tcp://flink@sb-ust1:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.ssl.tcp://flink@sb-ust1:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..
2019-05-01 21:17:01,960 INFO org.apache.flink.runtime.taskexecutor.TaskExecutor - Could not resolve ResourceManager address akka.ssl.tcp://flink@sb-ust1:6123/user/resourcemanager, retrying in 10000 ms: Ask timed out on [ActorSelection[Anchor(akka.ssl.tcp://flink@sb-ust1:6123/), Path(/user/resourcemanager)]] after [10000 ms]. Sender[null] sent message of type "akka.actor.Identify"..


Port 6123 is allowed on the jobmanager, but I haven't created a dedicated flink user.

- Is this necessary? If yes, is it possible to define another user for communication purposes?

I followed the documentation to set up SSL-based communication (https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/security-ssl.html#example-ssl-setup-standalone-and-kubernetes) and created a keystore as described:

keytool -genkeypair -alias swissbib.internal -keystore internal.keystore -dname "CN=flink.internal" -storepass verysecret -keypass verysecret -keyalg RSA -keysize 4096

and deployed the flink-conf.yaml on the whole cluster

(part of flink-conf.yaml)
security.ssl.internal.enabled: true
security.ssl.internal.keystore: /swissbib_index/apps/flink/conf/internal.keystore
security.ssl.internal.truststore: /swissbib_index/apps/flink/conf/internal.keystore
security.ssl.internal.keystore-password: verysecret
security.ssl.internal.truststore-password: verysecret
security.ssl.internal.key-password: verysecret

But this doesn't solve the problem: there is still no connection between the taskmanagers and the jobmanager.

- Another question: which ports have to be opened in the firewall for a standalone cluster?
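
To rule out the firewall for the one port that appears in the log above (6123 on sb-ust1), a plain TCP reachability check could be run from each taskmanager host. The following is only a minimal sketch: the helper name port_is_open is made up for illustration and is not part of Flink, and it tests only whether a TCP connection can be opened, not whether the SSL handshake succeeds.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper for illustration; it says nothing about SSL.
    """
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

Calling port_is_open("sb-ust1", 6123) on sb-ust2, sb-ust3 and sb-ust4 would show whether the jobmanager port is reachable at all; a True result would suggest the problem lies elsewhere than the firewall rule for that port.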

Thanks for any hints!

Günter
