Hi Navneeth

First of all, I suggest upgrading Flink to the latest version.
You can refer to [1] for savepoint compatibility when upgrading Flink.

As for the failure to connect to the JobManager address, you could log in to 
your pod and run 'nslookup jobmanager' to see whether the host can be resolved.
You can also check whether the 'jobmanager' service works as expected via 
'kubectl get svc'.
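
If 'nslookup' is not installed in the image, the same two checks (name 
resolution, then a TCP connection to the RPC port) can be sketched in Python. 
This is only a minimal sketch, assuming a Python 3 interpreter is available in 
the container; 'check_endpoint' is a hypothetical helper name, not part of 
Flink, and the host 'jobmanager' and port 6123 are taken from the logs quoted 
below.

```python
import socket


def check_endpoint(host, port, timeout=3.0):
    """Resolve `host`, then try a TCP connection to `port`.

    Returns (resolved_ip, reachable). resolved_ip is None when the
    name cannot be resolved at all.
    """
    try:
        # Roughly the question `nslookup <host>` answers (IPv4 only).
        ip = socket.gethostbyname(host)
    except socket.gaierror:
        return None, False
    try:
        # "Connection refused" ends up here: the name resolves, but
        # nothing is listening on that port (or it is blocked).
        with socket.create_connection((ip, port), timeout=timeout):
            return ip, True
    except OSError:
        return ip, False
```

Calling check_endpoint("jobmanager", 6123) from inside the TaskManager 
container returns (None, False) if the name does not resolve, and an IP 
together with False if the name resolves but the port refuses the connection. 
The second case would match the "Connection refused" lines in the logs.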

[1] https://ci.apache.org/projects/flink/flink-docs-stable/ops/upgrading.html#compatibility-table

Best
Yun Tang

________________________________
From: Navneeth Krishnan <[email protected]>
Sent: Friday, August 28, 2020 17:00
To: user <[email protected]>
Subject: Flink Migration

Hi All,

We are currently on a very old version of Flink, 1.4.0, and it has worked 
pretty well. But lately we have been facing checkpoint timeout issues. We would 
like to minimize any changes to the current pipelines and go ahead with the 
migration. With that said, our first pick was to migrate to 1.5.6 and later 
migrate to a newer version.

Do you guys think a more recent version like 1.6 or 1.7 might work? We did try 
1.8 but it requires some changes in the pipelines.

When we tried 1.5.6 with Docker Compose, we were unable to get the TaskManager 
to attach to the JobManager. Are there specific configurations required for 
newer versions?

Logs:

8-28 07:36:30.834 [main] INFO  org.apache.flink.runtime.util.LeaderRetrievalUtils  - TaskManager will try to connect for 10000 milliseconds before falling back to heuristics
2020-08-28 07:36:30.853 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Retrieved new target address jobmanager/172.21.0.8:6123.
2020-08-28 07:36:31.279 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Trying to connect to address jobmanager/172.21.0.8:6123
2020-08-28 07:36:31.280 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from address 'e6f9104cdc61/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:31.281 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from address '/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:31.281 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from address '/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:31.282 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from address '/127.0.0.1': Invalid argument (connect failed)
2020-08-28 07:36:31.283 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from address '/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:31.284 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from address '/127.0.0.1': Invalid argument (connect failed)
2020-08-28 07:36:31.684 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Trying to connect to address jobmanager/172.21.0.8:6123
2020-08-28 07:36:31.686 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from address 'e6f9104cdc61/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:31.687 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from address '/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:31.688 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from address '/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:31.688 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from address '/127.0.0.1': Invalid argument (connect failed)
2020-08-28 07:36:31.689 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from address '/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:31.690 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from address '/127.0.0.1': Invalid argument (connect failed)
2020-08-28 07:36:32.490 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Trying to connect to address jobmanager/172.21.0.8:6123
2020-08-28 07:36:32.491 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from address 'e6f9104cdc61/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:32.493 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from address '/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:32.494 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from address '/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:32.495 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from address '/127.0.0.1': Invalid argument (connect failed)
2020-08-28 07:36:32.496 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from address '/172.21.0.9': Connection refused (Connection refused)
2020-08-28 07:36:32.497 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Failed to connect from address '/127.0.0.1': Invalid argument (connect failed)
2020-08-28 07:36:34.099 [main] INFO  org.apache.flink.runtime.net.ConnectionUtils  - Trying to connect to address jobmanager/172.21.0.8:6123
2020-08-28 07:36:34.100 [main] INFO  org.apache.flink.runtime.taskexecutor.TaskManagerRunner  - TaskManager will use hostname/address 'e6f9104cdc61' (172.21.0.9) for communication.


Flink Conf

jobmanager.rpc.address: jobmanager

rest.address: jobmanager


Thanks
