[
https://issues.apache.org/jira/browse/FLINK-24031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17566640#comment-17566640
]
Sylvia Lin commented on FLINK-24031:
------------------------------------
[~wangyang0918] Yeah, I'm using the ConfigMap below, and the exact same
configuration works on another EKS cluster. I can confirm that on the EKS
cluster where it doesn't work, the host flink-jobmanager cannot be resolved,
while other DNS resolution works fine on the same cluster:
{code:java}
~$ curl flink-jobmanager
curl: (6) Could not resolve host: flink-jobmanager {code}
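The same lookup can be repeated from inside a pod without curl. The following is a minimal sketch, assuming Python is available in the container; the helper name resolve_host and its resolver parameter are hypothetical and only there to make the check easy to exercise. It reports whether the name resolves, nothing more.

```python
import socket


def resolve_host(hostname, port=6123, resolver=socket.getaddrinfo):
    """Return the sorted unique IP addresses for hostname, or [] if it cannot be resolved.

    `resolver` defaults to socket.getaddrinfo and is injectable so the helper
    can be exercised without a live DNS server (an assumption of this sketch).
    """
    try:
        infos = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Same failure class as "curl: (6) Could not resolve host".
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


# Example usage from inside a pod:
#   print(resolve_host("flink-jobmanager") or "flink-jobmanager did not resolve")
```

An empty result for flink-jobmanager while other names resolve would match the behaviour described above.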
configMap:
{code:java}
apiVersion: v1
kind: ConfigMap
metadata:
  name: flink-config
  labels:
    app: flink
data:
  flink-conf.yaml: |+
    kubernetes.cluster-id: <cluster_name>
    fs.allowed-fallback-filesystems: s3
    state.backend: rocksdb
    state.backend.incremental: true
    state.backend.local-recovery: true
    jobmanager.rpc.address: flink-jobmanager
    taskmanager.numberOfTaskSlots: 2
    blob.server.port: 6124
    jobmanager.rpc.port: 6123
    taskmanager.rpc.port: 6122
    jobmanager.memory.process.size: 1600m
    taskmanager.memory.process.size: 1728m
    restart-strategy: fixeddelay
    restart-strategy.fixed-delay.attempts: 100000
    scheduler-mode: reactive
    metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
    heartbeat.timeout: 8000
    heartbeat.interval: 5000
    rest.flamegraph.enabled: true
    hive.s3.use-instance-credentials: true {code}
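To double-check which host and port the TaskManagers will try to reach, the flink-conf.yaml body can be read back programmatically. This is a rough sketch, assuming the flat "key: value" layout shown above; parse_flink_conf and jobmanager_endpoint are hypothetical helper names, not part of Flink, and the parser is deliberately simplified.

```python
def parse_flink_conf(text):
    """Parse flat 'key: value' lines such as those in the flink-conf.yaml above.

    Simplified on purpose: blank lines, comment lines, and lines without a
    ': ' separator are skipped.
    """
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(": ")
        if sep:
            conf[key.strip()] = value.strip()
    return conf


def jobmanager_endpoint(conf):
    """Return (host, port) for the JobManager RPC endpoint found in the config."""
    host = conf["jobmanager.rpc.address"]
    # 6123 is used as the fallback when jobmanager.rpc.port is not set.
    port = int(conf.get("jobmanager.rpc.port", 6123))
    return host, port
```

With the ConfigMap above this yields ("flink-jobmanager", 6123), so that name has to resolve inside the cluster for the TaskManagers to register.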
> I am trying to deploy Flink in kubernetes but when I launch the taskManager
> in other container I get a Exception
> ----------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-24031
> URL: https://issues.apache.org/jira/browse/FLINK-24031
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Kubernetes
> Affects Versions: 1.13.0, 1.13.2
> Reporter: Julio Pérez
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.13.1
>
> Attachments: flink-map.yml, jobmanager.log, jobmanager.yml,
> taskmanager.log, taskmanager.yml
>
>
> I explain the problem here: [https://github.com/apache/flink/pull/17020]
> I have a problem when I try to run Flink in k8s with the following manifests.
> I get the following exceptions:
> # JobManager :
> {quote}2021-08-27 09:16:57,917 ERROR akka.remote.EndpointWriter [] - dropping
> message [class akka.actor.ActorSelectionMessage] for non-local recipient
> [Actor[akka.tcp://flink@jobmanager-hs:6123/]] arriving at
> [akka.tcp://flink@jobmanager-hs:6123] inbound addresses are
> [akka.tcp://flink@cluster:6123]
> 2021-08-27 09:17:01,255 DEBUG
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] -
> Trigger heartbeat request.
> 2021-08-27 09:17:01,284 DEBUG
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager [] -
> Trigger heartbeat request.
> 2021-08-27 09:17:10,008 DEBUG akka.remote.transport.netty.NettyTransport []
> - Remote connection to [/172.17.0.1:34827] was disconnected because of [id:
> 0x13ae1d03, /172.17.0.1:34827 :> /172.17.0.23:6123] DISCONNECTED
> 2021-08-27 09:17:10,008 DEBUG akka.remote.transport.ProtocolStateActor [] -
> Association between local [tcp://flink@cluster:6123] and remote
> [tcp://[email protected]:34827] was disassociated because the
> ProtocolStateActor failed: Unknown
> 2021-08-27 09:17:10,009 WARN akka.remote.ReliableDeliverySupervisor [] -
> Association with remote system [akka.tcp://[email protected]:6122] has
> failed, address is now gated for [50] ms. Reason: [Disassociated]
> {quote}
> TaskManager:
> {quote}INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not
> resolve ResourceManager address
> akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager__, retrying
> in 10000 ms: Could not connect to rpc endpoint under address
> akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager__.
> INFO org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not
> resolve ResourceManager address
> akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager__, retrying
> in 10000 ms: Could not connect to rpc endpoint under address
> akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager__.
> {quote}
> Best regards,
> Julio