[
https://issues.apache.org/jira/browse/FLINK-11127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773917#comment-16773917
]
Alex commented on FLINK-11127:
------------------------------
[~spoganshev], [~ray365], there is a simpler method to set configuration
settings (in particular {{taskmanager.host}}):
Some main classes in Flink (those that start the JM and TM services) accept
optional arguments of the form {{-Dconfig-key=config-value}}. Such an argument
overrides the corresponding {{config-key}} in {{flink-conf.yaml}}. *Side note:*
these optional arguments follow the style of the JVM's {{-Dsome-option}} flags,
but they can appear after the main class.
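To make the override behaviour concrete, here is a minimal Python sketch of the idea. It is only an illustration and not Flink's actual argument parser: the function name {{apply_dynamic_properties}}, the sample configuration, and the IP address are all made up for this example.

```python
def apply_dynamic_properties(file_config, args):
    """Overlay -Dkey=value arguments on top of values loaded from a config file.

    Hypothetical helper for illustration; Flink's real parsing lives in its
    own Java entry points and may differ in detail.
    """
    effective = dict(file_config)  # copy, so the file values stay untouched
    for arg in args:
        # Only arguments shaped like -Dkey=value are treated as overrides.
        if arg.startswith("-D") and "=" in arg:
            key, value = arg[2:].split("=", 1)  # split on the first '=' only
            effective[key] = value
    return effective


# Stand-in for values read from flink-conf.yaml (sample data, assumed).
file_config = {
    "jobmanager.rpc.address": "localhost",
    "taskmanager.host": "flink-task-manager-64b868487c-x9l4b",
}

# Stand-in for the container arguments; the IP is a placeholder.
args = ["task-manager", "-Dtaskmanager.host=10.1.2.3"]

effective = apply_dynamic_properties(file_config, args)
print(effective["taskmanager.host"])
```

In this sketch the command line value wins over the file value, and keys that are not passed on the command line keep their file value.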
Furthermore, the official [Flink docker images|https://hub.docker.com/_/flink]
let you pass additional command line arguments through as container
arguments.
So, in this concrete case, with the official Flink docker images and the example
[Kubernetes
templates|https://github.com/apache/flink/tree/master/flink-container/kubernetes],
you can just modify the TM deployment template definition:
{noformat}
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: flink-task-manager
spec:
  replicas: ${FLINK_JOB_PARALLELISM}
  template:
    metadata:
      labels:
        app: flink
        component: task-manager
    spec:
      containers:
      - name: flink-task-manager
        image: ${FLINK_IMAGE_NAME}
        args: ["task-manager", "-Djobmanager.rpc.address=flink-job-cluster",
               "-Dtaskmanager.host=$(K8S_POD_IP)"] # <<< additional new command line arg
        env:
        - name: K8S_POD_IP # <<< env variable definition, from K8s' downward api
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
{noformat}
> Make metrics query service establish connection to JobManager
> -------------------------------------------------------------
>
> Key: FLINK-11127
> URL: https://issues.apache.org/jira/browse/FLINK-11127
> Project: Flink
> Issue Type: Improvement
> Components: Distributed Coordination, Kubernetes, Metrics
> Affects Versions: 1.7.0
> Reporter: Ufuk Celebi
> Priority: Major
>
> As part of FLINK-10247, the internal metrics query service has been separated
> into its own actor system. Before this change, the JobManager (JM) queried
> TaskManager (TM) metrics via the TM actor. Now, the JM needs to establish a
> separate connection to the TM metrics query service actor.
> In the context of Kubernetes, this is problematic as the JM will typically
> *not* be able to resolve the TMs by name, resulting in warnings as follows:
> {code}
> 2018-12-11 08:32:33,962 WARN akka.remote.ReliableDeliverySupervisor
> - Association with remote system
> [akka.tcp://flink-metrics@flink-task-manager-64b868487c-x9l4b:39183] has
> failed, address is now gated for [50] ms. Reason: [Association failed with
> [akka.tcp://flink-metrics@flink-task-manager-64b868487c-x9l4b:39183]] Caused
> by: [flink-task-manager-64b868487c-x9l4b: Name does not resolve]
> {code}
> In order to expose the TMs by name in Kubernetes, users require a service
> *for each* TM instance, which is not practical.
> This currently results in the web UI not being able to display some basic
> metrics about the number of sent records. You can reproduce this by following
> the READMEs in {{flink-container/kubernetes}}.
> This worked before because the JM is typically exposed via a service with a
> known name, and the TMs establish the connection to it, which the metrics query
> service piggybacked on.
> A potential solution to this might be to let the query service connect to the
> JM similar to how the TMs register.
> I tagged this ticket as an improvement, but in the context of Kubernetes I
> would consider this to be a bug.