Great to hear that it works on K8s, and thanks for letting us know that the
problem is likely caused by Minikube.

Cheers,
Till

On Fri, Sep 4, 2020 at 8:53 AM superainbower <superainbo...@163.com> wrote:

> Hi Till & Yang,
> I can deploy Flink on Kubernetes (not Minikube), and it works well.
> So there is some problem with my Minikube, but I can't find and fix it.
> Anyway, I can deploy on K8s now.
> Thanks for your help!
> superainbower
> superainbo...@163.com
>
>
> On 09/3/2020 15:47, Till Rohrmann <trohrm...@apache.org> wrote:
>
> In order to exclude a Minikube problem, you could also try to run Flink on
> an older Minikube and an older K8s version. Our end-to-end tests use
> Minikube v1.8.2, for example.
>
> Cheers,
> Till
>
> On Thu, Sep 3, 2020 at 8:44 AM Yang Wang <danrtsey...@gmail.com> wrote:
>
>> Sorry, I forgot that the JobManager binds its RPC address to
>> flink-jobmanager, not the IP address.
>> So you also need to update jobmanager-session-deployment.yaml with the
>> following changes.
>>
>> ...
>>       containers:
>>       - name: jobmanager
>>         env:
>>         - name: JM_IP
>>           valueFrom:
>>             fieldRef:
>>               apiVersion: v1
>>               fieldPath: status.podIP
>>         image: flink:1.11
>>         args: ["jobmanager", "$(JM_IP)"]
>> ...
>>
>> After that, the JobManager binds its RPC address to its pod IP.
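For context: the `$(JM_IP)` in the args above is expanded by Kubernetes itself from the container's environment (the Downward API fieldRef injects the pod IP into JM_IP). A rough illustration of that substitution rule, not the actual kubelet code (simplified: the real expansion also supports `$$(` escaping):

```python
import re

def expand_args(args, env):
    # Illustration of how Kubernetes expands $(VAR) references in a
    # container's `args` from its `env`. References to undefined
    # variables are left literal, matching the documented behavior.
    def repl(m):
        return env.get(m.group(1), m.group(0))
    return [re.sub(r"\$\((\w+)\)", repl, a) for a in args]

# The JobManager args from the snippet above, with a sample pod IP:
print(expand_args(["jobmanager", "$(JM_IP)"], {"JM_IP": "172.18.0.5"}))
# → ['jobmanager', '172.18.0.5']
```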
>>
>> Best,
>> Yang
>>
>>
>> On Thu, Sep 3, 2020 at 11:38 AM, superainbower <superainbo...@163.com> wrote:
>>
>>> Hi Yang,
>>> I updated taskmanager-session-deployment.yaml like this:
>>>
>>> apiVersion: apps/v1
>>> kind: Deployment
>>> metadata:
>>>   name: flink-taskmanager
>>> spec:
>>>   replicas: 1
>>>   selector:
>>>     matchLabels:
>>>       app: flink
>>>       component: taskmanager
>>>   template:
>>>     metadata:
>>>       labels:
>>>         app: flink
>>>         component: taskmanager
>>>     spec:
>>>       containers:
>>>       - name: taskmanager
>>>         image: registry.cn-hangzhou.aliyuncs.com/superainbower/flink:1.11.1
>>>         args: ["taskmanager","-Djobmanager.rpc.address=172.18.0.5"]
>>>         ports:
>>>         - containerPort: 6122
>>>           name: rpc
>>>         - containerPort: 6125
>>>           name: query-state
>>>         livenessProbe:
>>>           tcpSocket:
>>>             port: 6122
>>>           initialDelaySeconds: 30
>>>           periodSeconds: 60
>>>         volumeMounts:
>>>         - name: flink-config-volume
>>>           mountPath: /opt/flink/conf/
>>>         securityContext:
>>>           runAsUser: 9999  # refers to user _flink_ from the official flink image; change if necessary
>>>       volumes:
>>>       - name: flink-config-volume
>>>         configMap:
>>>           name: flink-config
>>>           items:
>>>           - key: flink-conf.yaml
>>>             path: flink-conf.yaml
>>>           - key: log4j-console.properties
>>>             path: log4j-console.properties
>>>       imagePullSecrets:
>>>         - name: regcred
>>>
>>> Then I deleted the TaskManager pod and restarted it, but the logs print this:
>>>
>>> Could not resolve ResourceManager address
>>> akka.tcp://flink@172.18.0.5:6123/user/rpc/resourcemanager_*, retrying in
>>> 10000 ms: Could not connect to rpc endpoint under address
>>> akka.tcp://flink@172.18.0.5:6123/user/rpc/resourcemanager_*
>>>
>>> It changed flink-jobmanager to 172.18.0.5.
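Since the address here is a literal IP (so name resolution is out of the picture) but the RPC connection still fails, a plain TCP check against the JobManager's RPC port can separate a blocked port from an unreachable host. A minimal sketch, assuming it is run from inside the TaskManager pod; the `tcp_reachable` helper is made up for illustration, the IP and port come from the log lines above:

```python
import socket

def tcp_reachable(host, port, timeout=3.0):
    # Hypothetical diagnostic (not from the thread): attempt a plain
    # TCP connect. Ping succeeding while this fails points at a
    # port/proxy problem rather than routing or DNS.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Run inside the TaskManager pod, e.g.:
# tcp_reachable("172.18.0.5", 6123)
```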
>>> superainbower
>>> superainbo...@163.com
>>>
>>>
>>> On 09/3/2020 11:09, Yang Wang <danrtsey...@gmail.com> wrote:
>>>
>>> I guess something is wrong with your kube proxy, which prevents the
>>> TaskManager from connecting to the JobManager.
>>> You could verify this by directly using the JobManager pod IP instead of
>>> the service name.
>>>
>>> Please do as follows.
>>> * Edit the TaskManager deployment (via kubectl edit deployment
>>> flink-taskmanager) and update the args field to the following.
>>>    args: ["taskmanager", "-Djobmanager.rpc.address=172.18.0.5"]
>>> Given that "172.18.0.5" is the JobManager pod IP.
>>> * Delete the current TaskManager pod and let it restart again.
>>> * Now check the TaskManager logs to see whether it registered
>>> successfully.
>>>
>>>
>>>
>>> Best,
>>> Yang
>>>
>>> On Thu, Sep 3, 2020 at 9:35 AM, superainbower <superainbo...@163.com> wrote:
>>>
>>>> Hi Till,
>>>> I found something that may be helpful.
>>>> The Kubernetes Dashboard shows the JobManager IP as 172.18.0.5 and the
>>>> TaskManager IP as 172.18.0.6.
>>>> When I run 'kubectl exec -ti flink-taskmanager-74c68c6f48-jqpbn
>>>> -- /bin/bash' and then 'ping 172.18.0.5',
>>>> I get a response.
>>>> But when I ping flink-jobmanager, there is no response.
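Ping by IP succeeding while ping by the service name fails points at cluster DNS. The same check can be made directly at the resolver level; a minimal diagnostic sketch (the `resolve` helper is hypothetical, not from the thread):

```python
import socket

def resolve(name):
    # Hypothetical diagnostic: return the IP a hostname resolves to,
    # or None if resolution fails -- the same failure mode as the
    # UnknownHostException in the TaskManager logs.
    try:
        return socket.gethostbyname(name)
    except socket.gaierror:
        return None

# Inside the TaskManager pod one would expect resolve("flink-jobmanager")
# to return the service ClusterIP; None means cluster DNS is broken.
```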
>>>>
>>>> superainbower
>>>> superainbo...@163.com
>>>>
>>>>
>>>> On 09/3/2020 09:03, superainbower <superainbo...@163.com> wrote:
>>>>
>>>> Hi Till,
>>>> This is the TaskManager log.
>>>> As you can see, the logs print 'line 92 -- Could not connect to
>>>> flink-jobmanager:6123',
>>>> then print 'line 128 -- Could not resolve ResourceManager address
>>>> akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*,
>>>> retrying in 10000 ms: Could not connect to rpc endpoint under address
>>>> akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*.'
>>>> and repeats this.
>>>>
>>>> A few minutes later, the TaskManager shuts down and restarts.
>>>>
>>>> These are my YAML files; could you help me confirm whether I omitted
>>>> something? Thanks a lot!
>>>> ---------------------------------------------------
>>>> flink-configuration-configmap.yaml
>>>> apiVersion: v1
>>>> kind: ConfigMap
>>>> metadata:
>>>>   name: flink-config
>>>>   labels:
>>>>     app: flink
>>>> data:
>>>>   flink-conf.yaml: |+
>>>>     jobmanager.rpc.address: flink-jobmanager
>>>>     taskmanager.numberOfTaskSlots: 1
>>>>     blob.server.port: 6124
>>>>     jobmanager.rpc.port: 6123
>>>>     taskmanager.rpc.port: 6122
>>>>     queryable-state.proxy.ports: 6125
>>>>     jobmanager.memory.process.size: 1024m
>>>>     taskmanager.memory.process.size: 1024m
>>>>     parallelism.default: 1
>>>>   log4j-console.properties: |+
>>>>     rootLogger.level = INFO
>>>>     rootLogger.appenderRef.console.ref = ConsoleAppender
>>>>     rootLogger.appenderRef.rolling.ref = RollingFileAppender
>>>>     logger.akka.name = akka
>>>>     logger.akka.level = INFO
>>>>     logger.kafka.name= org.apache.kafka
>>>>     logger.kafka.level = INFO
>>>>     logger.hadoop.name = org.apache.hadoop
>>>>     logger.hadoop.level = INFO
>>>>     logger.zookeeper.name = org.apache.zookeeper
>>>>     logger.zookeeper.level = INFO
>>>>     appender.console.name = ConsoleAppender
>>>>     appender.console.type = CONSOLE
>>>>     appender.console.layout.type = PatternLayout
>>>>     appender.console.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
>>>>     appender.rolling.name = RollingFileAppender
>>>>     appender.rolling.type = RollingFile
>>>>     appender.rolling.append = false
>>>>     appender.rolling.fileName = ${sys:log.file}
>>>>     appender.rolling.filePattern = ${sys:log.file}.%i
>>>>     appender.rolling.layout.type = PatternLayout
>>>>     appender.rolling.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
>>>>     appender.rolling.policies.type = Policies
>>>>     appender.rolling.policies.size.type = SizeBasedTriggeringPolicy
>>>>     appender.rolling.policies.size.size=100MB
>>>>     appender.rolling.strategy.type = DefaultRolloverStrategy
>>>>     appender.rolling.strategy.max = 10
>>>>     logger.netty.name = org.apache.flink.shaded.akka.org.jboss.netty.channel.DefaultChannelPipeline
>>>>     logger.netty.level = OFF
>>>> ---------------------------------------------------
>>>> jobmanager-service.yaml
>>>> apiVersion: v1
>>>> kind: Service
>>>> metadata:
>>>>   name: flink-jobmanager
>>>> spec:
>>>>   type: ClusterIP
>>>>   ports:
>>>>   - name: rpc
>>>>     port: 6123
>>>>   - name: blob-server
>>>>     port: 6124
>>>>   - name: webui
>>>>     port: 8081
>>>>   selector:
>>>>     app: flink
>>>>     component: jobmanager
>>>> --------------------------------------------------
>>>> jobmanager-session-deployment.yaml
>>>> apiVersion: apps/v1
>>>> kind: Deployment
>>>> metadata:
>>>>   name: flink-jobmanager
>>>> spec:
>>>>   replicas: 1
>>>>   selector:
>>>>     matchLabels:
>>>>       app: flink
>>>>       component: jobmanager
>>>>   template:
>>>>     metadata:
>>>>       labels:
>>>>         app: flink
>>>>         component: jobmanager
>>>>     spec:
>>>>       containers:
>>>>       - name: jobmanager
>>>>         image: registry.cn-hangzhou.aliyuncs.com/superainbower/flink:1.11.1
>>>>         args: ["jobmanager"]
>>>>         ports:
>>>>         - containerPort: 6123
>>>>           name: rpc
>>>>         - containerPort: 6124
>>>>           name: blob-server
>>>>         - containerPort: 8081
>>>>           name: webui
>>>>         livenessProbe:
>>>>           tcpSocket:
>>>>             port: 6123
>>>>           initialDelaySeconds: 30
>>>>           periodSeconds: 60
>>>>         volumeMounts:
>>>>         - name: flink-config-volume
>>>>           mountPath: /opt/flink/conf
>>>>         securityContext:
>>>>           runAsUser: 9999  # refers to user _flink_ from the official flink image; change if necessary
>>>>       volumes:
>>>>       - name: flink-config-volume
>>>>         configMap:
>>>>           name: flink-config
>>>>           items:
>>>>           - key: flink-conf.yaml
>>>>             path: flink-conf.yaml
>>>>           - key: log4j-console.properties
>>>>             path: log4j-console.properties
>>>>       imagePullSecrets:
>>>>         - name: regcred
>>>> ---------------------------------------------------
>>>> taskmanager-session-deployment.yaml
>>>> apiVersion: apps/v1
>>>> kind: Deployment
>>>> metadata:
>>>>   name: flink-taskmanager
>>>> spec:
>>>>   replicas: 1
>>>>   selector:
>>>>     matchLabels:
>>>>       app: flink
>>>>       component: taskmanager
>>>>   template:
>>>>     metadata:
>>>>       labels:
>>>>         app: flink
>>>>         component: taskmanager
>>>>     spec:
>>>>       containers:
>>>>       - name: taskmanager
>>>>         image: registry.cn-hangzhou.aliyuncs.com/superainbower/flink:1.11.1
>>>>         args: ["taskmanager"]
>>>>         ports:
>>>>         - containerPort: 6122
>>>>           name: rpc
>>>>         - containerPort: 6125
>>>>           name: query-state
>>>>         livenessProbe:
>>>>           tcpSocket:
>>>>             port: 6122
>>>>           initialDelaySeconds: 30
>>>>           periodSeconds: 60
>>>>         volumeMounts:
>>>>         - name: flink-config-volume
>>>>           mountPath: /opt/flink/conf/
>>>>         securityContext:
>>>>           runAsUser: 9999  # refers to user _flink_ from the official flink image; change if necessary
>>>>       volumes:
>>>>       - name: flink-config-volume
>>>>         configMap:
>>>>           name: flink-config
>>>>           items:
>>>>           - key: flink-conf.yaml
>>>>             path: flink-conf.yaml
>>>>           - key: log4j-console.properties
>>>>             path: log4j-console.properties
>>>>       imagePullSecrets:
>>>>         - name: regcred
>>>>
>>>>
>>>> superainbower
>>>> superainbo...@163.com
>>>>
>>>>
>>>> On 09/2/2020 20:38, Till Rohrmann <trohrm...@apache.org> wrote:
>>>>
>>>> Hmm, this is indeed strange. Could you share the logs of the
>>>> TaskManager with us? Ideally, set the log level to DEBUG. Thanks a lot.
>>>>
>>>> Cheers,
>>>> Till
>>>>
>>>> On Wed, Sep 2, 2020 at 12:45 PM art <superainbo...@163.com> wrote:
>>>>
>>>>> Hi Till,
>>>>>
>>>>> The full output when I run the command 'kubectl get all' is this:
>>>>>
>>>>> NAME                                     READY   STATUS    RESTARTS
>>>>> AGE
>>>>> pod/flink-jobmanager-85bdbd98d8-ppjmf    1/1     Running   0
>>>>>  2m34s
>>>>> pod/flink-taskmanager-74c68c6f48-6jb5v   1/1     Running   0
>>>>>  2m34s
>>>>>
>>>>> NAME                       TYPE        CLUSTER-IP      EXTERNAL-IP
>>>>> PORT(S)                      AGE
>>>>> service/flink-jobmanager   ClusterIP   10.103.207.75   <none>
>>>>>  6123/TCP,6124/TCP,8081/TCP   2m34s
>>>>> service/kubernetes         ClusterIP   10.96.0.1       <none>
>>>>>  443/TCP                      5d2h
>>>>>
>>>>> NAME                                READY   UP-TO-DATE   AVAILABLE
>>>>> AGE
>>>>> deployment.apps/flink-jobmanager    1/1     1            1
>>>>> 2m34s
>>>>> deployment.apps/flink-taskmanager   1/1     1            1
>>>>> 2m34s
>>>>>
>>>>> NAME                                           DESIRED   CURRENT
>>>>> READY   AGE
>>>>> replicaset.apps/flink-jobmanager-85bdbd98d8    1         1         1
>>>>>     2m34s
>>>>> replicaset.apps/flink-taskmanager-74c68c6f48   1         1         1
>>>>>     2m34s
>>>>>
>>>>> And I can open the Flink UI, but the task manager count is 0, so the
>>>>> JobManager itself works well.
>>>>> I think the problem is that the TaskManager cannot register itself with
>>>>> the JobManager. Did I miss some configuration?
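For context, the "retrying in 10000 ms" messages in the logs come from a fixed-delay registration retry loop in the TaskManager. A rough sketch of that behavior, as an illustration only (not Flink's actual implementation):

```python
import time

def register_with_retry(try_register, retry_delay=10.0, max_attempts=5,
                        sleep=time.sleep):
    # Sketch of a fixed-delay registration loop like the one the
    # TaskManager log shows ("retrying in 10000 ms").
    for attempt in range(1, max_attempts + 1):
        if try_register():
            return attempt          # registered successfully
        sleep(retry_delay)          # wait before the next attempt
    return None                     # gave up, as when the pod eventually restarts

# e.g. succeed on the third attempt, without real sleeping:
attempts = iter([False, False, True])
print(register_with_retry(lambda: next(attempts), sleep=lambda _: None))
# → 3
```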
>>>>>
>>>>>
>>>>> On Sep 2, 2020, at 5:24 PM, Till Rohrmann <trohrm...@apache.org> wrote:
>>>>>
>>>>> Hi art,
>>>>>
>>>>> could you check what `kubectl get services` returns? Usually if you
>>>>> run `kubectl get all` you should also see the services, but in your case
>>>>> there are no services listed. You should see something like
>>>>> service/flink-jobmanager; otherwise the flink-jobmanager service (K8s
>>>>> service) is not running.
>>>>>
>>>>> Cheers,
>>>>> Till
>>>>>
>>>>> On Wed, Sep 2, 2020 at 11:15 AM art <superainbo...@163.com> wrote:
>>>>>
>>>>>> Hi Till,
>>>>>>
>>>>>> I'm sure the jobmanager-service is started; I can find it in the
>>>>>> Kubernetes Dashboard.
>>>>>>
>>>>>> When I run the command 'kubectl get deployment' I get this:
>>>>>> flink-jobmanager    1/1     1            1           33s
>>>>>> flink-taskmanager   1/1     1            1           33s
>>>>>>
>>>>>> When I run the command 'kubectl get all' I get this:
>>>>>> NAME                                     READY   STATUS    RESTARTS
>>>>>> AGE
>>>>>> pod/flink-jobmanager-85bdbd98d8-ppjmf    1/1     Running   0
>>>>>>  2m34s
>>>>>> pod/flink-taskmanager-74c68c6f48-6jb5v   1/1     Running   0
>>>>>>  2m34s
>>>>>>
>>>>>> So I think flink-jobmanager works well, but the TaskManager is
>>>>>> restarted every few minutes.
>>>>>>
>>>>>> My Minikube version: v1.12.3
>>>>>> Flink version: v1.11.1
>>>>>>
>>>>>> On Sep 2, 2020, at 4:27 PM, Till Rohrmann <trohrm...@apache.org> wrote:
>>>>>>
>>>>>> Hi art,
>>>>>>
>>>>>> could you verify that the jobmanager-service has been started? It
>>>>>> looks as if the name flink-jobmanager is not resolvable. It would also
>>>>>> help to know the Minikube and K8s versions you are using.
>>>>>>
>>>>>> Cheers,
>>>>>> Till
>>>>>>
>>>>>> On Wed, Sep 2, 2020 at 9:50 AM art <superainbo...@163.com> wrote:
>>>>>>
>>>>>>> Hi, I'm deploying Flink on Minikube, referring to
>>>>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.11/zh/ops/deployment/kubernetes.html
>>>>>>> :
>>>>>>> kubectl create -f flink-configuration-configmap.yaml
>>>>>>> kubectl create -f jobmanager-service.yaml
>>>>>>> kubectl create -f jobmanager-session-deployment.yaml
>>>>>>> kubectl create -f taskmanager-session-deployment.yaml
>>>>>>>
>>>>>>> But I got this
>>>>>>>
>>>>>>> 2020-09-02 06:45:42,664 WARN  akka.remote.ReliableDeliverySupervisor
>>>>>>>                       [] - Association with remote system [
>>>>>>> akka.tcp://flink@flink-jobmanager:6123] has failed, address is now
>>>>>>> gated for [50] ms. Reason: [Association failed with [
>>>>>>> akka.tcp://flink@flink-jobmanager:6123]] Caused by:
>>>>>>> [java.net.UnknownHostException: flink-jobmanager: Temporary failure in 
>>>>>>> name
>>>>>>> resolution]
>>>>>>> 2020-09-02 06:45:42,691 INFO
>>>>>>>  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Could
>>>>>>> not resolve ResourceManager address
>>>>>>> akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*,
>>>>>>> retrying in 10000 ms: Could not connect to rpc endpoint under address
>>>>>>> akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*.
>>>>>>> 2020-09-02 06:46:02,731 INFO
>>>>>>>  org.apache.flink.runtime.taskexecutor.TaskExecutor           [] - Could
>>>>>>> not resolve ResourceManager address
>>>>>>> akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*,
>>>>>>> retrying in 10000 ms: Could not connect to rpc endpoint under address
>>>>>>> akka.tcp://flink@flink-jobmanager:6123/user/rpc/resourcemanager_*.
>>>>>>> 2020-09-02 06:46:12,731 INFO
>>>>>>>  akka.remote.transport.ProtocolStateActor                     [] - No
>>>>>>> response from remote for outbound association. Associate timed out after
>>>>>>> [20000 ms].
>>>>>>>
>>>>>>> And when I run 'kubectl exec -ti
>>>>>>> flink-taskmanager-74c68c6f48-9tkvd -- /bin/bash' and then 'ping
>>>>>>> flink-jobmanager',
>>>>>>> I find I cannot ping flink-jobmanager from the TaskManager.
>>>>>>>
>>>>>>> I am new to K8s; can anyone point me to a tutorial? Thanks a lot!
>>>>>>>
>>>>>>
>>>>>>
>>>>>
