Hi Yang,
Thanks again for your help so far.
I tried your suggestion, but still had no luck.
The logs are attached; please let me know if there is anything else I should send.
Best,
kevin
On 2020/06/08 03:02:40, Yang Wang <[email protected]> wrote:
> Hi Kevin,
>
> It may be because of the K8s name length limit (no more than 63
> characters) [1], so the pod name must not be too long. I notice that you
> are using the cluster-id automatically generated by the client. That may
> cause problems. Could you set a meaningful cluster-id for your Flink
> session? For example,
>
> kubernetes-session.sh ... -Dkubernetes.cluster-id=my-flink-k8s-session
>
> This behavior has been improved in Flink 1.11, which checks the length on
> the client side before submission.
>
> If it still does not work, could you share your full command and the
> jobmanager logs? That will help a lot in finding the root cause.
>
>
> [1] https://stackoverflow.com/questions/50412837/kubernetes-label-name-63-character-limit
>
>
> Best,
> Yang
>
> On Sat, Jun 6, 2020 at 1:00 AM, kb <[email protected]> wrote:
>
> > Thanks Yang for the suggestion. I have tried it and I'm still getting the
> > same exception. Is it possible it's due to the null pod name? Operation:
> > [create] for kind: [Pod] with name: [null] in namespace: [default]
> > failed.
> >
> > Best,
> > kevin
> >
> > --
> > Sent from:
> > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
> >
>
[REDACTED flink]$ kubectl create serviceaccount svc-flink
serviceaccount/svc-flink created
[REDACTED flink]$ kubectl create clusterrolebinding svc-flink-role-binding \
    --clusterrole=cluster-admin --serviceaccount=default:svc-flink
clusterrolebinding.rbac.authorization.k8s.io/svc-flink-role-binding created
[REDACTED flink]$ ./flink-1.10.1/bin/kubernetes-session.sh \
    -Dkubernetes.jobmanager.service-account=svc-flink \
    -Dcontainerized.master.env.HTTP2_DISABLE=true \
    -Dkubernetes.container.image=REDACTED/flink:1.10.1-scala_2.11-s3-3 \
    -Dkubernetes.cluster-id=ledger-flink-session
2020-06-08 14:50:57,215 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.rpc.address, localhost
2020-06-08 14:50:57,216 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.rpc.port, 6123
2020-06-08 14:50:57,217 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.heap.size, 1024m
2020-06-08 14:50:57,217 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.memory.process.size, 1728m
2020-06-08 14:50:57,217 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.numberOfTaskSlots, 1
2020-06-08 14:50:57,217 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: parallelism.default, 1
2020-06-08 14:50:57,218 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.execution.failover-strategy, region
2020-06-08 14:50:58,542 INFO
org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils - The
derived from fraction jvm overhead memory (172.800mb (181193935 bytes)) is less
than its min value 192.000mb (201326592 bytes), min value will be used instead
2020-06-08 14:50:58,550 INFO org.apache.flink.kubernetes.utils.KubernetesUtils
- Kubernetes deployment requires a fixed port. Configuration
blob.server.port will be set to 6124
2020-06-08 14:50:58,551 INFO org.apache.flink.kubernetes.utils.KubernetesUtils
- Kubernetes deployment requires a fixed port. Configuration
taskmanager.rpc.port will be set to 6122
2020-06-08 14:50:59,532 INFO
org.apache.flink.kubernetes.KubernetesClusterDescriptor - Create flink
session cluster ledger-flink-session successfully, JobManager Web Interface:
REDACTED
[REDACTED flink]$ kubectl get services | fgrep flink
ledger-flink-session        ClusterIP      REDACTED   <none>      8081/TCP,6123/TCP,6124/TCP   24s
ledger-flink-session-rest   LoadBalancer   REDACTED   <pending>   8081:32379/TCP               24s
[REDACTED flink]$ kubectl get pods | fgrep flink
ledger-flink-session-7bf95b68b5-tsfw4 1/1 Running 0 6s
[REDACTED flink]$ nohup kubectl port-forward service/ledger-flink-session 8081 &
[1] 16722
[REDACTED flink]$ nohup: ignoring input and appending output to 'nohup.out'
[REDACTED flink]$ kubectl exec -it ledger-flink-session-7bf95b68b5-tsfw4 bash
root@ledger-flink-session-7bf95b68b5-tsfw4:/opt/flink# cat log/jobmanager.log
2020-06-08 14:51:31,391 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
--------------------------------------------------------------------------------
2020-06-08 14:51:31,393 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting
KubernetesSessionClusterEntrypoint (Version: 1.10.1, Rev:c5915cf,
Date:07.05.2020 @ 13:58:51 CST)
2020-06-08 14:51:31,393 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - OS current
user: root
2020-06-08 14:51:31,393 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Current
Hadoop/Kerberos user: <no hadoop dependency found>
2020-06-08 14:51:31,393 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JVM: OpenJDK
64-Bit Server VM - Oracle Corporation - 1.8/25.252-b09
2020-06-08 14:51:31,393 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Maximum heap
size: 409 MiBytes
2020-06-08 14:51:31,393 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JAVA_HOME:
/usr/local/openjdk-8
2020-06-08 14:51:31,394 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - No Hadoop
Dependency available
2020-06-08 14:51:31,394 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JVM Options:
2020-06-08 14:51:31,394 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Xms424m
2020-06-08 14:51:31,394 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Xmx424m
2020-06-08 14:51:31,394 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
-Dlog.file=/opt/flink/log/jobmanager.log
2020-06-08 14:51:31,394 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
-Dlogback.configurationFile=file:/opt/flink/conf/logback.xml
2020-06-08 14:51:31,394 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
-Dlog4j.configuration=file:/opt/flink/conf/log4j.properties
2020-06-08 14:51:31,394 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Program
Arguments: (none)
2020-06-08 14:51:31,394 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Classpath:
/opt/flink/lib/flink-table-blink_2.11-1.10.1.jar:/opt/flink/lib/flink-table_2.11-1.10.1.jar:/opt/flink/lib/log4j-1.2.17.jar:/opt/flink/lib/slf4j-log4j12-1.7.15.jar:/opt/flink/lib/flink-dist_2.11-1.10.1.jar:::
2020-06-08 14:51:31,395 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
--------------------------------------------------------------------------------
2020-06-08 14:51:31,396 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Registered UNIX
signal handlers for [TERM, HUP, INT]
2020-06-08 14:51:31,405 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: blob.server.port, 6124
2020-06-08 14:51:31,406 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.memory.process.size, 1728m
2020-06-08 14:51:31,406 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: kubernetes.internal.jobmanager.entrypoint.class,
org.apache.flink.kubernetes.entrypoint.KubernetesSessionClusterEntrypoint
2020-06-08 14:51:31,406 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.execution.failover-strategy, region
2020-06-08 14:51:31,406 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.rpc.address, ledger-flink-session.default
2020-06-08 14:51:31,406 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: kubernetes.internal.service.id,
76db5ed3-a997-11ea-a79c-0238474ce95c
2020-06-08 14:51:31,406 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: execution.target, kubernetes-session
2020-06-08 14:51:31,407 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.rpc.port, 6123
2020-06-08 14:51:31,407 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: kubernetes.cluster-id, ledger-flink-session
2020-06-08 14:51:31,407 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.rpc.port, 6122
2020-06-08 14:51:31,407 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: containerized.master.env.HTTP2_DISABLE, true
2020-06-08 14:51:31,407 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: internal.cluster.execution-mode, NORMAL
2020-06-08 14:51:31,408 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: kubernetes.container.image,
REDACTED/flink:1.10.1-scala_2.11-s3-3
2020-06-08 14:51:31,408 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: parallelism.default, 1
2020-06-08 14:51:31,408 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.numberOfTaskSlots, 1
2020-06-08 14:51:31,408 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: kubernetes.jobmanager.service-account, svc-flink
2020-06-08 14:51:31,408 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.heap.size, 1024m
2020-06-08 14:51:31,611 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting
KubernetesSessionClusterEntrypoint.
2020-06-08 14:51:31,611 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Install default
filesystem.
2020-06-08 14:51:31,711 INFO org.apache.flink.core.fs.FileSystem
- Hadoop is not in the classpath/dependencies. The extended set of
supported File Systems via Hadoop is not available.
2020-06-08 14:51:31,795 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Install
security context.
2020-06-08 14:51:31,804 INFO
org.apache.flink.runtime.security.modules.HadoopModuleFactory - Cannot create
Hadoop Security Module because Hadoop cannot be found in the Classpath.
2020-06-08 14:51:31,812 INFO
org.apache.flink.runtime.security.modules.JaasModule - Jaas file will
be created as /tmp/jaas-4193678875133969299.conf.
2020-06-08 14:51:31,815 INFO org.apache.flink.runtime.security.SecurityUtils
- Cannot install HadoopSecurityContext because Hadoop cannot be
found in the Classpath.
2020-06-08 14:51:31,816 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Initializing
cluster services.
2020-06-08 14:51:31,886 INFO
org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Trying to start
actor system at ledger-flink-session.default:6123
2020-06-08 14:51:32,968 INFO akka.event.slf4j.Slf4jLogger
- Slf4jLogger started
2020-06-08 14:51:32,993 INFO akka.remote.Remoting
- Starting remoting
2020-06-08 14:51:33,193 INFO akka.remote.Remoting
- Remoting started; listening on addresses
:[akka.tcp://flink@ledger-flink-session.default:6123]
2020-06-08 14:51:33,299 INFO
org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Actor system
started at akka.tcp://flink@ledger-flink-session.default:6123
2020-06-08 14:51:33,393 INFO org.apache.flink.configuration.Configuration
- Config uses fallback configuration key 'jobmanager.rpc.address'
instead of key 'rest.address'
2020-06-08 14:51:33,469 INFO org.apache.flink.runtime.blob.BlobServer
- Created BLOB server storage directory
/tmp/blobStore-a99afe14-0228-4ce4-908a-40916fed9148
2020-06-08 14:51:33,472 INFO org.apache.flink.runtime.blob.BlobServer
- Started BLOB server at 0.0.0.0:6124 - max concurrent requests: 50
- max backlog: 1000
2020-06-08 14:51:33,483 INFO
org.apache.flink.runtime.metrics.MetricRegistryImpl - No metrics
reporter configured, no metrics will be exposed/reported.
2020-06-08 14:51:33,486 INFO
org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Trying to start
actor system at ledger-flink-session.default:0
2020-06-08 14:51:33,567 INFO akka.event.slf4j.Slf4jLogger
- Slf4jLogger started
2020-06-08 14:51:33,576 INFO akka.remote.Remoting
- Starting remoting
2020-06-08 14:51:33,583 INFO akka.remote.Remoting
- Remoting started; listening on addresses
:[akka.tcp://flink-metrics@ledger-flink-session.default:45069]
2020-06-08 14:51:33,595 INFO
org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Actor system
started at akka.tcp://flink-metrics@ledger-flink-session.default:45069
2020-06-08 14:51:33,604 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService
- Starting RPC endpoint for
org.apache.flink.runtime.metrics.dump.MetricQueryService at
akka://flink-metrics/user/MetricQueryService .
2020-06-08 14:51:33,781 INFO
org.apache.flink.runtime.dispatcher.FileArchivedExecutionGraphStore -
Initializing FileArchivedExecutionGraphStore: Storage directory
/tmp/executionGraphStore-b4cce8ce-d743-4aeb-9d26-93dcfd6da8b8, expiration time
3600000, maximum cache size 52428800 bytes.
2020-06-08 14:51:33,824 INFO org.apache.flink.configuration.Configuration
- Config uses fallback configuration key 'jobmanager.rpc.address'
instead of key 'rest.address'
2020-06-08 14:51:33,825 INFO
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Upload
directory /tmp/flink-web-a2fb24fe-f8c0-4d79-bbd7-db57b8dd3796/flink-web-upload
does not exist.
2020-06-08 14:51:33,826 INFO
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Created
directory /tmp/flink-web-a2fb24fe-f8c0-4d79-bbd7-db57b8dd3796/flink-web-upload
for file uploads.
2020-06-08 14:51:33,827 INFO
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Starting rest
endpoint.
2020-06-08 14:51:34,271 INFO
org.apache.flink.runtime.webmonitor.WebMonitorUtils - Determined
location of main cluster component log file: /opt/flink/log/jobmanager.log
2020-06-08 14:51:34,271 INFO
org.apache.flink.runtime.webmonitor.WebMonitorUtils - Determined
location of main cluster component stdout file: /opt/flink/log/jobmanager.out
2020-06-08 14:51:34,600 INFO
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Rest endpoint
listening at ledger-flink-session.default:8081
2020-06-08 14:51:34,601 INFO
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint -
http://ledger-flink-session.default:8081 was granted leadership with
leaderSessionID=00000000-0000-0000-0000-000000000000
2020-06-08 14:51:34,667 INFO
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Web frontend
listening at http://ledger-flink-session.default:8081.
2020-06-08 14:51:35,284 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService
- Starting RPC endpoint for
org.apache.flink.kubernetes.KubernetesResourceManager at
akka://flink/user/resourcemanager .
2020-06-08 14:51:35,370 INFO
org.apache.flink.runtime.clusterframework.TaskExecutorProcessUtils - The
derived from fraction jvm overhead memory (172.800mb (181193935 bytes)) is less
than its min value 192.000mb (201326592 bytes), min value will be used instead
2020-06-08 14:51:35,374 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: blob.server.port, 6124
2020-06-08 14:51:35,374 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.memory.process.size, 1728m
2020-06-08 14:51:35,374 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: kubernetes.internal.jobmanager.entrypoint.class,
org.apache.flink.kubernetes.entrypoint.KubernetesSessionClusterEntrypoint
2020-06-08 14:51:35,374 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.execution.failover-strategy, region
2020-06-08 14:51:35,374 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.rpc.address, ledger-flink-session.default
2020-06-08 14:51:35,374 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: kubernetes.internal.service.id,
76db5ed3-a997-11ea-a79c-0238474ce95c
2020-06-08 14:51:35,374 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: execution.target, kubernetes-session
2020-06-08 14:51:35,374 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.rpc.port, 6123
2020-06-08 14:51:35,374 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: kubernetes.cluster-id, ledger-flink-session
2020-06-08 14:51:35,374 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.rpc.port, 6122
2020-06-08 14:51:35,375 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: containerized.master.env.HTTP2_DISABLE, true
2020-06-08 14:51:35,375 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: internal.cluster.execution-mode, NORMAL
2020-06-08 14:51:35,375 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: kubernetes.container.image,
REDACTED/flink:1.10.1-scala_2.11-s3-3
2020-06-08 14:51:35,375 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: parallelism.default, 1
2020-06-08 14:51:35,375 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.numberOfTaskSlots, 1
2020-06-08 14:51:35,375 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: kubernetes.jobmanager.service-account, svc-flink
2020-06-08 14:51:35,375 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.heap.size, 1024m
2020-06-08 14:51:35,396 INFO
org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess -
Start SessionDispatcherLeaderProcess.
2020-06-08 14:51:35,399 INFO
org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess -
Recover all persisted job graphs.
2020-06-08 14:51:35,400 INFO
org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess -
Successfully recovered 0 persisted job graphs.
2020-06-08 14:51:35,405 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService
- Starting RPC endpoint for
org.apache.flink.runtime.dispatcher.StandaloneDispatcher at
akka://flink/user/dispatcher .
2020-06-08 14:51:37,071 INFO
org.apache.flink.kubernetes.KubernetesResourceManager - Recovered 0
pods from previous attempts, current attempt id is 1.
2020-06-08 14:51:37,090 INFO
org.apache.flink.kubernetes.KubernetesResourceManager - ResourceManager
akka.tcp://flink@ledger-flink-session.default:6123/user/resourcemanager was
granted leadership with fencing token 00000000000000000000000000000000
2020-06-08 14:51:37,093 INFO
org.apache.flink.runtime.resourcemanager.slotmanager.SlotManagerImpl -
Starting the SlotManager.
2020-06-08 15:00:28,596 WARN
org.apache.flink.runtime.webmonitor.handlers.JarRunHandler - Configuring the
job submission via query parameters is deprecated. Please migrate to submitting
a JSON request instead.
2020-06-08 15:00:29,102 INFO REDACTED.ledger.Flow
- Using parameter file classpath:/application.properties
2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager -
Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme zip
2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager -
Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme par
2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager -
Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme res
2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager -
Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme tar
2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager -
Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme sar
2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager -
Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme tgz
2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager -
Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme war
2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager -
Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme tbz2
2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager -
Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme file
2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager -
Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme gz
2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager -
Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme tmp
2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager -
Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme ear
2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager -
Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme ejb3
2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager -
Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme jar
2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager -
Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme bz2
2020-06-08 15:00:29,211 INFO REDACTED.VirtualFileSystemManager -
Registering VFS REDACTED.ApacheVFSVirtualFileSystem@1def9b90 for scheme ram
2020-06-08 15:00:29,212 INFO REDACTED.VirtualFileSystemManager -
Registering VFS REDACTED.ClasspathVirtualFileSystem@540bc652 for scheme
classpath
2020-06-08 15:00:29,212 INFO REDACTED.VirtualFileSystemManager -
Registering VFS REDACTED.UrlVirtualFileSystem@26312f62 for scheme http
2020-06-08 15:00:29,213 INFO REDACTED.VirtualFileSystemManager -
Registering VFS REDACTED.UrlVirtualFileSystem@26312f62 for scheme https
2020-06-08 15:00:29,387 INFO REDACTED.ledger.Flow
- Enabling externalized checkpointing
2020-06-08 15:00:29,389 INFO REDACTED.ledger.Flow
- Configuring RocksDB Backend: s3://REDACTED/state
2020-06-08 15:00:29,484 INFO REDACTED.Redis
- redis - (cluster=true), connecting to host=REDACTED, port=REDACTED
2020-06-08 15:00:30,187 INFO io.lettuce.core.EpollProvider
- Starting with epoll library
2020-06-08 15:00:30,189 INFO io.lettuce.core.KqueueProvider
- Starting without optional kqueue library
2020-06-08 15:00:32,687 INFO org.apache.flink.api.java.typeutils.TypeExtractor
- class
org.apache.flink.streaming.api.functions.source.TimestampedFileInputSplit does
not contain a setter for field modificationTime
2020-06-08 15:00:32,687 INFO org.apache.flink.api.java.typeutils.TypeExtractor
- Class class
org.apache.flink.streaming.api.functions.source.TimestampedFileInputSplit
cannot be used as a POJO type because not all fields are valid POJO fields, and
must be processed as GenericType. Please read the Flink documentation on "Data
Types & Serialization" for details of the effect on performance.
2020-06-08 15:00:33,414 INFO REDACTED.ledger.Flow
- Job ID: 7b863a7272e6f37fc62b3300980d9db1
2020-06-08 15:00:37,519 INFO
org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Received
JobGraph submission ac0ba486c934fc663d01e347323c785a (REDACTED).
2020-06-08 15:00:37,520 INFO
org.apache.flink.runtime.dispatcher.StandaloneDispatcher - Submitting job
ac0ba486c934fc663d01e347323c785a (REDACTED).
2020-06-08 15:00:37,570 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService
- Starting RPC endpoint for
org.apache.flink.runtime.jobmaster.JobMaster at akka://flink/user/jobmanager_0 .
2020-06-08 15:00:37,578 INFO org.apache.flink.runtime.jobmaster.JobMaster
- Initializing job REDACTED (ac0ba486c934fc663d01e347323c785a).
2020-06-08 15:00:37,594 INFO org.apache.flink.runtime.jobmaster.JobMaster
- Using restart back off time strategy
FixedDelayRestartBackoffTimeStrategy(maxNumberRestartAttempts=10,
backoffTimeMS=60000) for REDACTED (ac0ba486c934fc663d01e347323c785a).
2020-06-08 15:00:37,675 INFO org.apache.flink.runtime.jobmaster.JobMaster
- Running initialization on master for job REDACTED
(ac0ba486c934fc663d01e347323c785a).
2020-06-08 15:00:37,675 INFO org.apache.flink.runtime.jobmaster.JobMaster
- Successfully ran initialization on master in 0 ms.
2020-06-08 15:00:37,701 INFO org.apache.flink.runtime.jobmaster.JobMaster
- Using application-defined state backend:
RocksDBStateBackend{checkpointStreamBackend=File State Backend (checkpoints:
's3://REDACTED/state', savepoints: 'null', asynchronous: UNDEFINED,
fileStateThreshold: -1), localRocksDbDirectories=null,
enableIncrementalCheckpointing=TRUE, numberOfTransferThreads=-1,
writeBatchSize=-1}
2020-06-08 15:00:37,701 INFO org.apache.flink.runtime.jobmaster.JobMaster
- Configuring application-defined state backend with job/cluster
config
2020-06-08 15:00:37,706 INFO
org.apache.flink.contrib.streaming.state.RocksDBStateBackend - Using
predefined options: FLASH_SSD_OPTIMIZED.
2020-06-08 15:00:37,707 INFO
org.apache.flink.contrib.streaming.state.RocksDBStateBackend - Using default
options factory: DefaultConfigurableOptionsFactory{configuredOptions={}}.
2020-06-08 15:00:38,703 INFO
org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionStrategy
- Start building failover regions.
2020-06-08 15:00:38,704 INFO
org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionStrategy
- Created 1 failover regions.
2020-06-08 15:00:38,704 INFO org.apache.flink.runtime.jobmaster.JobMaster
- Using failover strategy
org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionStrategy@601936a6
for REDACTED (ac0ba486c934fc663d01e347323c785a).
2020-06-08 15:00:38,706 INFO
org.apache.flink.runtime.jobmaster.JobManagerRunnerImpl - JobManager
runner for job REDACTED (ac0ba486c934fc663d01e347323c785a) was granted
leadership with session id 00000000-0000-0000-0000-000000000000 at
akka.tcp://flink@ledger-flink-session.default:6123/user/jobmanager_0.
2020-06-08 15:00:38,710 INFO org.apache.flink.runtime.jobmaster.JobMaster
- Starting execution of job REDACTED
(ac0ba486c934fc663d01e347323c785a) under job master id
00000000000000000000000000000000.
2020-06-08 15:00:38,771 INFO org.apache.flink.runtime.jobmaster.JobMaster
- Starting scheduling with scheduling strategy
[org.apache.flink.runtime.scheduler.strategy.EagerSchedulingStrategy]
2020-06-08 15:00:38,771 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - Job REDACTED
(ac0ba486c934fc663d01e347323c785a) switched from state CREATED to RUNNING.
2020-06-08 15:00:38,781 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - Joiner (1/1)
(10d1b92f690e3f4dae02fb806300a7d7) switched from CREATED to SCHEDULED.
2020-06-08 15:00:38,794 INFO
org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Cannot serve
slot request, no ResourceManager connected. Adding as pending request
[SlotRequestId{9723c61fe85823a01c57b6b593d05662}]
2020-06-08 15:00:38,803 INFO org.apache.flink.runtime.jobmaster.JobMaster
- Connecting to ResourceManager
akka.tcp://flink@ledger-flink-session.default:6123/user/resourcemanager(00000000000000000000000000000000)
2020-06-08 15:00:38,808 INFO org.apache.flink.runtime.jobmaster.JobMaster
- Resolved ResourceManager address, beginning registration
2020-06-08 15:00:38,808 INFO org.apache.flink.runtime.jobmaster.JobMaster
- Registration at ResourceManager attempt 1 (timeout=100ms)
2020-06-08 15:00:38,810 INFO
org.apache.flink.kubernetes.KubernetesResourceManager - Registering job
manager
00000000000000000000000000000000@akka.tcp://flink@ledger-flink-session.default:6123/user/jobmanager_0
for job ac0ba486c934fc663d01e347323c785a.
2020-06-08 15:00:38,867 INFO
org.apache.flink.kubernetes.KubernetesResourceManager - Registered job
manager
00000000000000000000000000000000@akka.tcp://flink@ledger-flink-session.default:6123/user/jobmanager_0
for job ac0ba486c934fc663d01e347323c785a.
2020-06-08 15:00:38,870 INFO org.apache.flink.runtime.jobmaster.JobMaster
- JobManager successfully registered at ResourceManager, leader id:
00000000000000000000000000000000.
2020-06-08 15:00:38,871 INFO
org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl - Requesting new
slot [SlotRequestId{9723c61fe85823a01c57b6b593d05662}] and profile
ResourceProfile{UNKNOWN} from resource manager.
2020-06-08 15:00:38,872 INFO
org.apache.flink.kubernetes.KubernetesResourceManager - Request slot
with profile ResourceProfile{UNKNOWN} for job ac0ba486c934fc663d01e347323c785a
with allocation id ef56d50dbc6cfa93973089506fc8244c.
2020-06-08 15:00:38,875 INFO
org.apache.flink.kubernetes.KubernetesResourceManager - Starting new
worker with resource profile, ResourceProfile{UNKNOWN}
2020-06-08 15:00:38,875 INFO
org.apache.flink.kubernetes.KubernetesResourceManager - Requesting new
TaskManager pod with <1728,1.0>. Number pending requests 1.
2020-06-08 15:00:38,877 INFO
org.apache.flink.kubernetes.KubernetesResourceManager - TaskManager
ledger-flink-session-taskmanager-1-1 will be started with
TaskExecutorProcessSpec {cpuCores=1.0, frameworkHeapSize=128.000mb (134217728
bytes), frameworkOffHeapSize=128.000mb (134217728 bytes),
taskHeapSize=384.000mb (402653174 bytes), taskOffHeapSize=0 bytes,
networkMemSize=128.000mb (134217730 bytes), managedMemorySize=512.000mb
(536870920 bytes), jvmMetaspaceSize=256.000mb (268435456 bytes),
jvmOverheadSize=192.000mb (201326592 bytes)}.
2020-06-08 15:00:49,179 ERROR
org.apache.flink.kubernetes.KubernetesResourceManager - Could not start
TaskManager in pod ledger-flink-session-taskmanager-1-1.
java.util.concurrent.CompletionException:
io.fabric8.kubernetes.client.KubernetesClientException: Operation: [create]
for kind: [Pod] with name: [null] in namespace: [default] failed.
at
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:273)
at
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:280)
at
java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1643)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: io.fabric8.kubernetes.client.KubernetesClientException: Operation:
[create] for kind: [Pod] with name: [null] in namespace: [default] failed.
at
io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:64)
at
io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:72)
at
io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:331)
at
io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:324)
at
org.apache.flink.kubernetes.kubeclient.Fabric8FlinkKubeClient.lambda$createTaskManagerPod$0(Fabric8FlinkKubeClient.java:184)
at
java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1640)
... 3 more
Caused by: java.net.SocketTimeoutException: timeout
at
org.apache.flink.kubernetes.shadded.okio.Okio$4.newTimeoutException(Okio.java:232)
at
org.apache.flink.kubernetes.shadded.okio.AsyncTimeout.exit(AsyncTimeout.java:285)
at
org.apache.flink.kubernetes.shadded.okio.AsyncTimeout$2.read(AsyncTimeout.java:241)
at
org.apache.flink.kubernetes.shadded.okio.RealBufferedSource.indexOf(RealBufferedSource.java:354)
at
org.apache.flink.kubernetes.shadded.okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:226)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.http1.Http1Codec.readHeaderLine(Http1Codec.java:215)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.http1.Http1Codec.readResponseHeaders(Http1Codec.java:189)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:88)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:126)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at
io.fabric8.kubernetes.client.utils.BackwardsCompatibilityInterceptor.intercept(BackwardsCompatibilityInterceptor.java:119)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at
io.fabric8.kubernetes.client.utils.ImpersonatorInterceptor.intercept(ImpersonatorInterceptor.java:68)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at
io.fabric8.kubernetes.client.utils.HttpClientUtils.lambda$createHttpClient$3(HttpClientUtils.java:110)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
at
org.apache.flink.kubernetes.shadded.okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
at
org.apache.flink.kubernetes.shadded.okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:254)
at
org.apache.flink.kubernetes.shadded.okhttp3.RealCall.execute(RealCall.java:92)
at
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:411)
at
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:372)
at
io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:241)
at
io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:798)
at
io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:328)
... 6 more
Caused by: java.net.SocketException: Socket closed
at java.net.SocketInputStream.read(SocketInputStream.java:204)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
at sun.security.ssl.InputRecord.read(InputRecord.java:503)
at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:990)
at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:948)
at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
at org.apache.flink.kubernetes.shadded.okio.Okio$2.read(Okio.java:140)
at
org.apache.flink.kubernetes.shadded.okio.AsyncTimeout$2.read(AsyncTimeout.java:237)
... 39 more
...
(From here the log repeats, starting again at "Requesting new TaskManager pod
with <1728,1.0>. Number pending requests 1.")
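
As a quick sanity check on the 63-character limit mentioned in the quoted reply, here is a minimal Python sketch that measures the two names appearing in the session above. It is not part of Flink or Kubernetes; `K8S_NAME_LIMIT` and `within_limit` are names made up for illustration. Both names come out well under the limit, which, together with the `SocketTimeoutException` at the bottom of the stack trace, suggests the name length may not be what is failing here.

```python
# Illustrative check only: K8S_NAME_LIMIT and within_limit are hypothetical
# helper names, not Flink or Kubernetes APIs.
K8S_NAME_LIMIT = 63  # limit cited in the quoted reply


def within_limit(name: str, limit: int = K8S_NAME_LIMIT) -> bool:
    """Return True if the name is no longer than the limit."""
    return len(name) <= limit


names = [
    "ledger-flink-session",                  # cluster-id passed on the command line
    "ledger-flink-session-taskmanager-1-1",  # pod name from the ERROR log entry
]

for name in names:
    print(f"{name}: {len(name)} chars, within limit: {within_limit(name)}")
```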