Thanks for all the help, people - you made me go through my code once again and 
discover that I switched argument positions for job manager and resource 
manager addresses :-)
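
A minimal sketch of a guard that would catch this kind of swap early. The
LeaderAddresses class and its fields are hypothetical, not Flink API and not
the actual implementation; the only detail taken from the logs below is the
/user/resourcemanager endpoint path.

```java
import java.util.Objects;

/** Hypothetical holder for the two leader addresses a custom HA service hands out. */
public class LeaderAddresses {
    // Endpoint path the jobmanager log reports for the ResourceManager.
    static final String RESOURCE_MANAGER_PATH = "/user/resourcemanager";

    final String resourceManagerAddress;
    final String jobManagerAddress;

    LeaderAddresses(String resourceManagerAddress, String jobManagerAddress) {
        Objects.requireNonNull(resourceManagerAddress, "resourceManagerAddress");
        Objects.requireNonNull(jobManagerAddress, "jobManagerAddress");
        // Fail fast if the two String arguments were passed in the wrong order.
        if (!resourceManagerAddress.endsWith(RESOURCE_MANAGER_PATH)) {
            throw new IllegalArgumentException(
                "Not a resource manager address: " + resourceManagerAddress);
        }
        this.resourceManagerAddress = resourceManagerAddress;
        this.jobManagerAddress = jobManagerAddress;
    }

    public static void main(String[] args) {
        String base = "akka.tcp://flink@jobmanager:6123";
        // Correct order: accepted.
        new LeaderAddresses(base + "/user/resourcemanager", base + "/user/jobmanager");
        // Swapped order: rejected at construction time.
        try {
            new LeaderAddresses(base + "/user/jobmanager", base + "/user/resourcemanager");
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Two plain String parameters of the same type are easy to transpose, so a
check like this (or separate wrapper types per address) turns a silent
misconfiguration into an immediate error.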

The docker ensemble now starts fine; I’m ironing out the remaining bugs.

I’ll participate in the survey too!

> On Aug 21, 2019, at 7:18 PM, Zili Chen <wander4...@gmail.com> wrote:
> 
> Besides, would you like to participate in our survey thread [1] on the
> user list about "How do you use high-availability services in Flink?"
> 
> It would help Flink improve its high-availability services.
> 
> Best,
> tison.
> 
> [1] 
> https://lists.apache.org/x/thread.html/c0cc07197e6ba30b45d7709cc9e17d8497e5e3f33de504d58dfcafad@%3Cuser.flink.apache.org%3E
> 
> On Thu, Aug 22, 2019 at 10:16 AM, Zili Chen <wander4...@gmail.com> wrote:
> Hi Aleksandar,
> 
> Based on your log:
> 
> taskmanager_1   | 2019-08-22 00:05:03,713 INFO  
> org.apache.flink.runtime.taskexecutor.TaskExecutor            - Connecting to 
> ResourceManager 
> akka.tcp://flink@jobmanager:6123/user/jobmanager(00000000000000000000000000000000).
> taskmanager_1   | 2019-08-22 00:05:04,137 INFO  
> org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not 
> resolve ResourceManager address 
> akka.tcp://flink@jobmanager:6123/user/jobmanager, retrying in 10000 ms: 
> Could not connect to rpc endpoint under address 
> akka.tcp://flink@jobmanager:6123/user/jobmanager..
> 
> It looks like your leader retrieval service for the resource manager is 
> returning a jobmanager address. Please check the implementation carefully, or 
> share it on the mailing list so that others can help investigate.
> 
> Best,
> tison.
> 
> 
> On Thu, Aug 22, 2019 at 10:11 AM, Zhu Zhu <reed...@gmail.com> wrote:
> Hi Aleksandar,
> 
> The resource manager address is retrieved from the HA services.
> Could you check whether your customized HA services return the right 
> LeaderRetrievalService, and whether that LeaderRetrievalService really 
> retrieves the right leader's address?
> Or is it possible that the resource manager address stored in HA is somehow 
> being replaced by the jobmanager address?
> 
> Thanks,
> Zhu Zhu
> 
> On Thu, Aug 22, 2019 at 8:16 AM, Aleksandar Mastilovic 
> <amastilo...@sightmachine.com> wrote:
> Hi all,
> 
> I’m experimenting with my own implementation of HA services that would 
> persist JobManager information on a Kubernetes volume instead of in 
> ZooKeeper.
> 
> I’ve set the high-availability option in flink-conf.yaml to the FQN of my 
> factory class, and started the docker ensemble as I usually do (i.e. with no 
> special “cluster” arguments or scripts.)
> 
> What’s happening now is that TaskManager is unable to connect to 
> ResourceManager, because it seems it’s trying to use the /user/jobmanager 
> path instead of /user/resourcemanager.
> 
> Here’s what I found in the logs:
> 
> 
> jobmanager_1    | 2019-08-22 00:05:00,963 INFO  akka.remote.Remoting          
>                                 - Remoting started; listening on addresses 
> :[akka.tcp://flink@jobmanager:6123]
> jobmanager_1    | 2019-08-22 00:05:00,975 INFO  
> org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils         - Actor system 
> started at akka.tcp://flink@jobmanager:6123
> 
> jobmanager_1    | 2019-08-22 00:05:02,380 INFO  
> org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Starting RPC 
> endpoint for 
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager at 
> akka://flink/user/resourcemanager .
> 
> jobmanager_1    | 2019-08-22 00:05:03,138 INFO  
> org.apache.flink.runtime.rpc.akka.AkkaRpcService              - Starting RPC 
> endpoint for org.apache.flink.runtime.dispatcher.StandaloneDispatcher at 
> akka://flink/user/dispatcher .
> 
> jobmanager_1    | 2019-08-22 00:05:03,211 INFO  
> org.apache.flink.runtime.resourcemanager.StandaloneResourceManager  - 
> ResourceManager akka.tcp://flink@jobmanager:6123/user/resourcemanager was 
> granted leadership with fencing token 00000000000000000000000000000000
> 
> jobmanager_1    | 2019-08-22 00:05:03,292 INFO  
> org.apache.flink.runtime.dispatcher.StandaloneDispatcher      - Dispatcher 
> akka.tcp://flink@jobmanager:6123/user/dispatcher was granted leadership 
> with fencing token 00000000-0000-0000-0000-000000000000
> 
> taskmanager_1   | 2019-08-22 00:05:03,713 INFO  
> org.apache.flink.runtime.taskexecutor.TaskExecutor            - Connecting to 
> ResourceManager 
> akka.tcp://flink@jobmanager:6123/user/jobmanager(00000000000000000000000000000000).
> taskmanager_1   | 2019-08-22 00:05:04,137 INFO  
> org.apache.flink.runtime.taskexecutor.TaskExecutor            - Could not 
> resolve ResourceManager address 
> akka.tcp://flink@jobmanager:6123/user/jobmanager, retrying in 10000 ms: 
> Could not connect to rpc endpoint under address 
> akka.tcp://flink@jobmanager:6123/user/jobmanager..
> 
> Is this a known bug? I’d appreciate any help I can get.
> 
> Thanks,
> Aleksandar Mastilovic
