Hi Robert,
I found the cause: it was a bug in the job itself. The code after 
streamEnv.execute(...) called System.exit(0). This went unnoticed before 1.11, 
but with 1.11, I guess as in Application Mode, main is called from the job 
manager directly, and System.exit(0) just exits the whole JVM.
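
For anyone hitting the same thing, here is a minimal sketch of the fix, assuming the job's main may be invoked inside the job manager JVM as described above. The class name JobEntryPoint and the helper runJob are hypothetical stand-ins for the real job code, and the Flink pipeline itself is left out so the snippet compiles on its own:

```java
public class JobEntryPoint {

    // Hypothetical stand-in for the real job logic, i.e. building the
    // pipeline and calling streamEnv.execute(...). It reports a status
    // code to the caller instead of terminating the JVM.
    public static int runJob(String[] args) {
        // ... build and execute the pipeline here ...
        return 0;
    }

    public static void main(String[] args) {
        int status = runJob(args);
        // No System.exit(0) on success: if main() runs inside the job
        // manager JVM, System.exit would shut down the job manager too.
        // On success main() simply returns; failures are signalled with
        // an exception rather than an exit call.
        if (status != 0) {
            throw new IllegalStateException("job failed with status " + status);
        }
    }
}
```

The point is only that main returns normally on success and reports failures with an exception, so whichever JVM hosts the call stays in control of its own lifecycle.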

Thank you, and sorry for the unnecessary noise.
Alexey

________________________________
From: Robert Metzger <rmetz...@apache.org>
Sent: Tuesday, July 28, 2020 10:38:42 PM
To: Alexey Trenikhun <yen...@msn.com>
Cc: Flink User Mail List <user@flink.apache.org>
Subject: Re: Flink 1.11.1 - job manager exists with exit code 0

Hey Alexey,

What is the exit code of the JobManager? Can you check if it has been killed by 
the OOM killer?
You could also try running the job with DEBUG log level; it might give us an 
additional indication of why the JVM dies.
What kind of job are you submitting? Is it complicated?

On Sat, Jul 25, 2020 at 6:43 AM Alexey Trenikhun <yen...@msn.com> wrote:
Hello,

I have a Flink 1.11.1 session cluster running via docker compose. I upload the 
job jar, and when I submit the job, the jobmanager exits without any errors in 
the log:

...
{"@timestamp":"2020-07-25T04:32:54.007Z","@version":"1","message":"Starting execution of job katana-fsp (64ff3943fdc5024c5beef1612518c627) under job master id 00000000000000000000000000000000.","logger_name":"org.apache.flink.runtime.jobmaster.JobMaster","thread_name":"flink-akka.actor.default-dispatcher-18","level":"INFO","level_value":20000}
{"@timestamp":"2020-07-25T04:32:54.011Z","@version":"1","message":"Stopped BLOB server at 0.0.0.0:6124","logger_name":"org.apache.flink.runtime.blob.BlobServer","thread_name":"BlobServer shutdown hook","level":"INFO","level_value":20000}
{"@timestamp":"2020-07-25T04:32:54.015Z","@version":"1","message":"Starting scheduling with scheduling strategy [org.apache.flink.runtime.scheduler.strategy.EagerSchedulingStrategy]","logger_name":"org.apache.flink.runtime.jobmaster.JobMaster","thread_name":"flink-akka.actor.default-dispatcher-18","level":"INFO","level_value":20000}
{"@timestamp":"2020-07-25T04:32:54.016Z","@version":"1","message":"Job katana-fsp (64ff3943fdc5024c5beef1612518c627) switched from state CREATED to RUNNING.","logger_name":"org.apache.flink.runtime.executiongraph.ExecutionGraph","thread_name":"flink-akka.actor.default-dispatcher-18","level":"INFO","level_value":20000}

Any ideas how to diagnose it?

Thanks,
Alexey
