No, I guess it's stable.
2021-11-02 22:41:08,276 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
--------------------------------------------------------------------------------
2021-11-02 22:41:08,292 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting
StandaloneSessionClusterEntrypoint (Version: 1.10.0, Rev:aa4eb8f,
Date:07.02.2020 @ 19:18:19 CET)
2021-11-02 22:41:08,292 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - OS
current user: flink
2021-11-02 22:41:08,304 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Current
Hadoop/Kerberos user: <no hadoop dependency found>
2021-11-02 22:41:08,304 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JVM:
OpenJDK 64-Bit Server VM - Private Build - 1.8/25.292-b10
2021-11-02 22:41:08,306 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Maximum
heap size: 2944 MiBytes
2021-11-02 22:41:08,306 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
JAVA_HOME: (not set)
2021-11-02 22:41:08,311 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - No Hadoop
Dependency available
2021-11-02 22:41:08,311 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - JVM
Options:
2021-11-02 22:41:08,313 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
-Xms3072m
2021-11-02 22:41:08,313 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
-Xmx3072m
2021-11-02 22:41:08,313 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
-Dlog.file=/opt/flink-1.10.0/log/flink-flink-standalonesession-0-xxxxxxjob-0003.log
2021-11-02 22:41:08,313 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
-Dlog4j.configuration=file:/opt/flink-1.10.0/conf/log4j.properties
2021-11-02 22:41:08,313 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
-Dlogback.configurationFile=file:/opt/flink-1.10.0/conf/logback.xml
2021-11-02 22:41:08,314 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Program
Arguments:
2021-11-02 22:41:08,317 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
--configDir
2021-11-02 22:41:08,318 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
/opt/flink-1.10.0/conf
2021-11-02 22:41:08,318 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
--executionMode
2021-11-02 22:41:08,318 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - cluster
2021-11-02 22:41:08,318 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - --host
2021-11-02 22:41:08,318 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
xxxxxxjob-0003
2021-11-02 22:41:08,329 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
--webui-port
2021-11-02 22:41:08,330 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 8081
2021-11-02 22:41:08,330 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
Classpath:
/opt/flink-1.10.0/lib/flink-table-blink_2.12-1.10.0.jar:/opt/flink-1.10.0/lib/flink-table_2.12-1.10.0.jar:/opt/flink-1.10.0/lib/log4j-1.2.17.jar:/opt/flink-1.10.0/lib/slf4j-log4j12-1.7.15.jar:/opt/flink-1.10.0/lib/flink-dist_2.12-1.10.0.jar:::
2021-11-02 22:41:08,330 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
--------------------------------------------------------------------------------
2021-11-02 22:41:08,362 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Registered
UNIX signal handlers for [TERM, HUP, INT]
2021-11-02 22:41:08,558 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: env.ssh.opts, -l flink -oStrictHostKeyChecking=no
2021-11-02 22:41:08,558 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: cluster.evenly-spread-out-slots, true
2021-11-02 22:41:08,559 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: jobmanager.heap.size, 3072m
2021-11-02 22:41:08,559 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.memory.flink.size, 3072m
2021-11-02 22:41:08,559 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.memory.jvm-metaspace.size, 256m
2021-11-02 22:41:08,559 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: taskmanager.numberOfTaskSlots, 8
2021-11-02 22:41:08,560 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: parallelism.default, 1
2021-11-02 22:41:08,560 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: high-availability, zookeeper
2021-11-02 22:41:08,560 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: high-availability.storageDir,
file:///mnt/flink/ha/flink_1_10/
2021-11-02 22:41:08,560 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: high-availability.zookeeper.quorum,
xxxxxx-0001.xxxxxx.xxxxxx:2181,xxxxxx-0002.xxxxxx.xxxxxx:2181,xxxxxx-0003.xxxxxx.xxxxxx:2181
2021-11-02 22:41:08,561 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: high-availability.zookeeper.path.root, /flink_1_10
2021-11-02 22:41:08,561 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: high-availability.cluster-id,
/flink_1_10_cluster_0001
2021-11-02 22:41:08,561 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: web.upload.dir, /mnt/flink/uploads/flink_1_10
2021-11-02 22:41:08,562 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: state.backend, filesystem
2021-11-02 22:41:08,562 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: state.checkpoints.dir,
file:///mnt/flink/checkpoints/flink_1_10
2021-11-02 22:41:08,562 INFO
org.apache.flink.configuration.GlobalConfiguration - Loading
configuration property: state.savepoints.dir,
file:///mnt/flink/savepoints/flink_1_10
2021-11-02 22:41:09,935 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Starting
StandaloneSessionClusterEntrypoint.
2021-11-02 22:41:09,935 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Install
default filesystem.
2021-11-02 22:41:10,405 INFO org.apache.flink.xxxxxx.fs.FileSystem
- Hadoop is not in the classpath/dependencies. The
extended set of supported File Systems via Hadoop is not available.
2021-11-02 22:41:10,482 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Install
security context.
2021-11-02 22:41:10,516 INFO
org.apache.flink.runtime.security.modules.HadoopModuleFactory - Cannot
create Hadoop Security Module because Hadoop cannot be found in the
Classpath.
2021-11-02 22:41:10,615 INFO
org.apache.flink.runtime.security.modules.JaasModule - Jaas file
will be created as /tmp/jaas-7770543068119743820.conf.
2021-11-02 22:41:10,638 INFO
org.apache.flink.runtime.security.SecurityUtils - Cannot
install HadoopSecurityContext because Hadoop cannot be found in the
Classpath.
2021-11-02 22:41:10,639 INFO
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -
Initializing cluster services.
2021-11-02 22:41:10,744 INFO
org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Trying to
start actor system at xxxxxxjob-0003:0
2021-11-02 22:41:13,357 INFO akka.event.slf4j.Slf4jLogger
- Slf4jLogger started
2021-11-02 22:41:13,459 INFO akka.remote.Remoting
- Starting remoting
2021-11-02 22:41:14,277 INFO akka.remote.Remoting
- Remoting started; listening on addresses
:[akka.tcp://flink@xxxxxxjob-0003:39977]
2021-11-02 22:41:14,868 INFO
org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Actor
system started at akka.tcp://flink@xxxxxxjob-0003:39977
2021-11-02 22:41:14,966 INFO
org.apache.flink.runtime.blob.FileSystemBlobStore - Creating
highly available BLOB storage directory at
file:/mnt/flink/ha/flink_1_10/flink_1_10_cluster_0001/blob
2021-11-02 22:41:14,990 INFO org.apache.flink.runtime.util.ZooKeeperUtils
- Enforcing default ACL for ZK connections
2021-11-02 22:41:14,991 INFO org.apache.flink.runtime.util.ZooKeeperUtils
- Using '/flink_1_10/flink_1_10_cluster_0001' as Zookeeper
namespace.
2021-11-02 22:41:15,247 INFO
org.apache.flink.shaded.curator.org.apache.curator.framework.imps.CuratorFrameworkImpl
- Starting
2021-11-02 22:41:15,267 INFO
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper - Client
environment:zookeeper.version=3.4.10-39d3a4f269333c922ed3db283be479f9deacaa0f,
built on 03/23/2017 10:13 GMT
2021-11-02 22:41:15,267 INFO
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper - Client
environment:host.name=xxxxxxjob-0003
2021-11-02 22:41:15,281 INFO
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper - Client
environment:java.version=1.8.0_292
2021-11-02 22:41:15,281 INFO
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper - Client
environment:java.vendor=Private Build
2021-11-02 22:41:15,282 INFO
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper - Client
environment:java.home=/usr/lib/jvm/java-8-openjdk-amd64/jre
2021-11-02 22:41:15,282 INFO
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper - Client
environment:java.class.path=/opt/flink-1.10.0/lib/flink-table-blink_2.12-1.10.0.jar:/opt/flink-1.10.0/lib/flink-table_2.12-1.10.0.jar:/opt/flink-1.10.0/lib/log4j-1.2.17.jar:/opt/flink-1.10.0/lib/slf4j-log4j12-1.7.15.jar:/opt/flink-1.10.0/lib/flink-dist_2.12-1.10.0.jar:::
2021-11-02 22:41:15,282 INFO
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper - Client
environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib
2021-11-02 22:41:15,282 INFO
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper - Client
environment:java.io.tmpdir=/tmp
2021-11-02 22:41:15,282 INFO
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper - Client
environment:java.compiler=<NA>
2021-11-02 22:41:15,282 INFO
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper - Client
environment:os.name=Linux
2021-11-02 22:41:15,282 INFO
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper - Client
environment:os.arch=amd64
2021-11-02 22:41:15,282 INFO
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper - Client
environment:os.version=4.15.0-161-generic
2021-11-02 22:41:15,282 INFO
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper - Client
environment:user.name=flink
2021-11-02 22:41:15,282 INFO
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper - Client
environment:user.home=/home/flink
2021-11-02 22:41:15,282 INFO
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper - Client
environment:user.dir=/opt/flink-1.10.0
2021-11-02 22:41:15,283 INFO
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ZooKeeper -
Initiating client connection,
connectString=xxxxxx-0001.xxxxxx.xxxxxx:2181,xxxxxx-0002.xxxxxx.xxxxxx:2181,xxxxxx-0003.xxxxxx.xxxxxx:2181
sessionTimeout=60000
watcher=org.apache.flink.shaded.curator.org.apache.curator.ConnectionState@27216cd
2021-11-02 22:41:15,377 WARN
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn - SASL
configuration failed: javax.security.auth.login.LoginException: No JAAS
configuration section named 'Client' was found in specified JAAS
configuration file: '/tmp/jaas-7770543068119743820.conf'. Will continue
connection to Zookeeper server without SASL authentication, if Zookeeper
server allows it.
2021-11-02 22:41:15,379 INFO
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn -
Opening socket connection to server xxxxxx.35/xxxxxx.35:2181
2021-11-02 22:41:15,386 ERROR
org.apache.flink.shaded.curator.org.apache.curator.ConnectionState -
Authentication failed
2021-11-02 22:41:15,396 INFO
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn -
Socket connection established to xxxxxx.35/xxxxxx.35:2181, initiating
session
2021-11-02 22:41:15,421 INFO
org.apache.flink.shaded.zookeeper.org.apache.zookeeper.ClientCnxn -
Session establishment complete on server xxxxxx.35/xxxxxx.35:2181,
sessionid = 0x200000086a20007, negotiated timeout = 40000
2021-11-02 22:41:15,425 INFO
org.apache.flink.shaded.curator.org.apache.curator.framework.state.ConnectionStateManager
- State change: CONNECTED
2021-11-02 22:41:15,438 INFO org.apache.flink.runtime.blob.BlobServer
- Created BLOB server storage directory
/tmp/blobStore-9cb73f27-11db-4c42-a3fc-9b77f558e722
2021-11-02 22:41:15,451 INFO org.apache.flink.runtime.blob.BlobServer
- Started BLOB server at 0.0.0.0:34845 - max concurrent
requests: 50 - max backlog: 1000
2021-11-02 22:41:15,496 INFO
org.apache.flink.runtime.metrics.MetricRegistryImpl - No metrics
reporter configured, no metrics will be exposed/reported.
2021-11-02 22:41:15,509 INFO
org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Trying to
start actor system at xxxxxxjob-0003:0
2021-11-02 22:41:15,624 INFO akka.event.slf4j.Slf4jLogger
- Slf4jLogger started
2021-11-02 22:41:15,654 INFO akka.remote.Remoting
- Starting remoting
2021-11-02 22:41:15,700 INFO akka.remote.Remoting
- Remoting started; listening on addresses
:[akka.tcp://flink-metrics@xxxxxxjob-0003:38997]
2021-11-02 22:41:15,733 INFO
org.apache.flink.runtime.rpc.akka.AkkaRpcServiceUtils - Actor
system started at akka.tcp://flink-metrics@xxxxxxjob-0003:38997
2021-11-02 22:41:15,755 INFO
org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting
RPC endpoint for org.apache.flink.runtime.metrics.dump.MetricQueryService
at akka://flink-metrics/user/MetricQueryService .
2021-11-02 22:41:16,379 INFO
org.apache.flink.runtime.dispatcher.FileArchivedExecutionGraphStore -
Initializing FileArchivedExecutionGraphStore: Storage directory
/tmp/executionGraphStore-40cf7548-25fc-4b2b-a6a8-d504eb611847, expiration
time 3600000, maximum cache size 52428800 bytes.
2021-11-02 22:41:16,526 INFO org.apache.flink.configuration.Configuration
- Config uses fallback configuration key
'jobmanager.rpc.address' instead of key 'rest.address'
2021-11-02 22:41:16,526 INFO org.apache.flink.configuration.Configuration
- Config uses fallback configuration key 'rest.port'
instead of key 'rest.bind-port'
2021-11-02 22:41:16,536 INFO
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Upload
directory /mnt/flink/uploads/flink_1_10/flink-web-upload does not exist.
2021-11-02 22:41:16,558 INFO
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Created
directory /mnt/flink/uploads/flink_1_10/flink-web-upload for file uploads.
2021-11-02 22:41:16,563 INFO
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Starting
rest endpoint.
2021-11-02 22:41:17,262 INFO
org.apache.flink.runtime.webmonitor.WebMonitorUtils - Determined
location of main cluster component log file:
/opt/flink-1.10.0/log/flink-flink-standalonesession-0-xxxxxxjob-0003.log
2021-11-02 22:41:17,263 INFO
org.apache.flink.runtime.webmonitor.WebMonitorUtils - Determined
location of main cluster component stdout file:
/opt/flink-1.10.0/log/flink-flink-standalonesession-0-xxxxxxjob-0003.out
2021-11-02 22:41:18,135 INFO
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Rest
endpoint listening at xxxxxxjob-0003:8081
2021-11-02 22:41:18,145 INFO
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Starting ZooKeeperLeaderElectionService
ZooKeeperLeaderElectionService{leaderPath='/leader/rest_server_lock'}.
2021-11-02 22:41:18,303 INFO
org.apache.flink.runtime.dispatcher.DispatcherRestEndpoint - Web
frontend listening at http://xxxxxxjob-0003:8081.
2021-11-02 22:41:18,385 INFO
org.apache.flink.runtime.rpc.akka.AkkaRpcService - Starting
RPC endpoint for
org.apache.flink.runtime.resourcemanager.StandaloneResourceManager at
akka://flink/user/resourcemanager .
2021-11-02 22:41:18,430 INFO
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Starting ZooKeeperLeaderElectionService
ZooKeeperLeaderElectionService{leaderPath='/leader/dispatcher_lock'}.
2021-11-02 22:41:18,431 INFO
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService
- Starting ZooKeeperLeaderRetrievalService /leader/resource_manager_lock.
2021-11-02 22:41:18,431 INFO
org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService
- Starting ZooKeeperLeaderRetrievalService /leader/dispatcher_lock.
2021-11-02 22:41:18,437 INFO
org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService -
Starting ZooKeeperLeaderElectionService
ZooKeeperLeaderElectionService{leaderPath='/leader/resource_manager_lock'}.
2021-11-02 23:20:22,682 ERROR
org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerLogFileHandler
- Failed to transfer file from TaskExecutor
7e1b7db5918004e4160fdecec1bbdad7.
java.util.concurrent.CompletionException:
org.apache.flink.util.FlinkException: Could not retrieve file from
transient blob store.
at
org.apache.flink.runtime.rest.handler.taskmanager.AbstractTaskManagerFileHandler.lambda$respondToRequest$0(AbstractTaskManagerFileHandler.java:135)
at
java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:670)
at
java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:646)
at
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456)
at
org.apache.flink.shaded.netty4.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at
org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:416)
at
org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:515)
at
org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
at
org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.flink.util.FlinkException: Could not retrieve file
from transient blob store.
... 10 more
Caused by: java.io.FileNotFoundException: Local file
/tmp/blobStore-9cb73f27-11db-4c42-a3fc-9b77f558e722/no_job/blob_t-274d3c2d5acd78ced877d898b1877b10b62a64df-590b54325d599a6782a77413691e0a7b
does not exist and failed to copy from blob store.
at
org.apache.flink.runtime.blob.BlobServer.getFileInternal(BlobServer.java:516)
at
org.apache.flink.runtime.blob.BlobServer.getFileInternal(BlobServer.java:444)
at org.apache.flink.runtime.blob.BlobServer.getFile(BlobServer.java:369)
at
org.apache.flink.runtime.rest.handler.taskmanager.AbstractTaskManagerFileHandler.lambda$respondToRequest$0(AbstractTaskManagerFileHandler.java:133)
... 9 more
2021-11-02 23:20:22,703 ERROR
org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerLogFileHandler
- Unhandled exception.
org.apache.flink.util.FlinkException: Could not retrieve file from
transient blob store.
at
org.apache.flink.runtime.rest.handler.taskmanager.AbstractTaskManagerFileHandler.lambda$respondToRequest$0(AbstractTaskManagerFileHandler.java:135)
at
java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:670)
at
java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:646)
at
java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:456)
at
org.apache.flink.shaded.netty4.io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at
org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:416)
at
org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:515)
at
org.apache.flink.shaded.netty4.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
at
org.apache.flink.shaded.netty4.io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.FileNotFoundException: Local file
/tmp/blobStore-9cb73f27-11db-4c42-a3fc-9b77f558e722/no_job/blob_t-274d3c2d5acd78ced877d898b1877b10b62a64df-590b54325d599a6782a77413691e0a7b
does not exist and failed to copy from blob store.
at
org.apache.flink.runtime.blob.BlobServer.getFileInternal(BlobServer.java:516)
at
org.apache.flink.runtime.blob.BlobServer.getFileInternal(BlobServer.java:444)
at org.apache.flink.runtime.blob.BlobServer.getFile(BlobServer.java:369)
at
org.apache.flink.runtime.rest.handler.taskmanager.AbstractTaskManagerFileHandler.lambda$respondToRequest$0(AbstractTaskManagerFileHandler.java:133)
... 9 more
2021-11-02 23:47:57,865 WARN akka.remote.transport.netty.NettyTransport
- Remote connection to [xxxxxxjob-0001/xxxxxx.72:37007]
failed with java.io.IOException: Connection reset by peer
2021-11-02 23:47:57,912 WARN akka.remote.ReliableDeliverySupervisor
- Association with remote system
[akka.tcp://flink@xxxxxxjob-0001:37007] has failed, address is now gated
for [50] ms. Reason: [Disassociated]
2021-11-02 23:53:41,565 WARN akka.remote.transport.netty.NettyTransport
- Remote connection to [xxxxxxjob-0001/xxxxxx.72:42961]
failed with java.io.IOException: Connection reset by peer
2021-11-02 23:53:41,571 WARN akka.remote.ReliableDeliverySupervisor
- Association with remote system
[akka.tcp://flink-metrics@xxxxxxjob-0001:42961] has failed, address is now
gated for [50] ms. Reason: [Disassociated]
On Thu, 4 Nov 2021 at 03:45, Guowei Ma <guowei....@gmail.com> wrote:
> >>>Ok I missed the log below. I guess when the task manager was stopped
> this happened.
> I think that if the TM had stopped, you would not get the log either. But in
> that case it would throw a different exception, "UnknownTaskExecutorException",
> with a message like “No TaskExecutor registered under ”.
>
> >>> But I guess it's ok and not a big issue???
> Does this happen continuously?
>
> Best,
> Guowei
>
>
> On Thu, Nov 4, 2021 at 12:39 AM John Smith <java.dev....@gmail.com> wrote:
>
>> Ok I missed the log below. I guess when the task manager was stopped this
>> happened.
>>
>> I attached the full sequence. But I guess it's ok and not a big issue???
>>
>>
>> 2021-11-02 23:20:22,682 ERROR
>> org.apache.flink.runtime.rest.handler.taskmanager.TaskManagerLogFileHandler
>> - Failed to transfer file from TaskExecutor
>> 7e1b7db5918004e4160fdecec1bbdad7.
>> java.util.concurrent.CompletionException: org.apache.flink.util.
>> FlinkException: Could not retrieve file from transient blob store.
>> at org.apache.flink.runtime.rest.handler.taskmanager.
>> AbstractTaskManagerFileHandler.lambda$respondToRequest$0(
>> AbstractTaskManagerFileHandler.java:135)
>> at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture
>> .java:670)
>> at java.util.concurrent.CompletableFuture$UniAccept.tryFire(
>> CompletableFuture.java:646)
>> at java.util.concurrent.CompletableFuture$Completion.run(
>> CompletableFuture.java:456)
>> at org.apache.flink.shaded.netty4.io.netty.util.concurrent.
>> AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
>> at org.apache.flink.shaded.netty4.io.netty.util.concurrent.
>> SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:416)
>> at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop
>> .run(NioEventLoop.java:515)
>> at org.apache.flink.shaded.netty4.io.netty.util.concurrent.
>> SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
>> at org.apache.flink.shaded.netty4.io.netty.util.internal.
>> ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>> at java.lang.Thread.run(Thread.java:748)
>> Caused by: org.apache.flink.util.FlinkException: Could not retrieve file
>> from transient blob store.
>> ... 10 more
>> Caused by: java.io.FileNotFoundException: Local file
>> /tmp/blobStore-9cb73f27-11db-4c42-a3fc-9b77f558e722/no_job/blob_t-274d3c2d5acd78ced877d898b1877b10b62a64df-590b54325d599a6782a77413691e0a7b
>> does not exist and failed to copy from blob store.
>> at org.apache.flink.runtime.blob.BlobServer.getFileInternal(
>> BlobServer.java:516)
>> at org.apache.flink.runtime.blob.BlobServer.getFileInternal(
>> BlobServer.java:444)
>> at org.apache.flink.runtime.blob.BlobServer.getFile(BlobServer.java:
>> 369)
>> at org.apache.flink.runtime.rest.handler.taskmanager.
>> AbstractTaskManagerFileHandler.lambda$respondToRequest$0(
>> AbstractTaskManagerFileHandler.java:133)
>> ... 9 more
>> 2021-11-02 23:20:22,703 ERROR
>> org.apache.flink.runtime.rest.handler.taskmanager.
>> TaskManagerLogFileHandler - Unhandled exception.
>> org.apache.flink.util.FlinkException: Could not retrieve file from
>> transient blob store.
>> at org.apache.flink.runtime.rest.handler.taskmanager.
>> AbstractTaskManagerFileHandler.lambda$respondToRequest$0(
>> AbstractTaskManagerFileHandler.java:135)
>> at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture
>> .java:670)
>> at java.util.concurrent.CompletableFuture$UniAccept.tryFire(
>> CompletableFuture.java:646)
>> at java.util.concurrent.CompletableFuture$Completion.run(
>> CompletableFuture.java:456)
>> at org.apache.flink.shaded.netty4.io.netty.util.concurrent.
>> AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
>> at org.apache.flink.shaded.netty4.io.netty.util.concurrent.
>> SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:416)
>> at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop
>> .run(NioEventLoop.java:515)
>> at org.apache.flink.shaded.netty4.io.netty.util.concurrent.
>> SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
>> at org.apache.flink.shaded.netty4.io.netty.util.internal.
>> ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>> at java.lang.Thread.run(Thread.java:748)
>> Caused by: java.io.FileNotFoundException: Local file
>> /tmp/blobStore-9cb73f27-11db-4c42-a3fc-9b77f558e722/no_job/blob_t-274d3c2d5acd78ced877d898b1877b10b62a64df-590b54325d599a6782a77413691e0a7b
>> does not exist and failed to copy from blob store.
>> at org.apache.flink.runtime.blob.BlobServer.getFileInternal(
>> BlobServer.java:516)
>> at org.apache.flink.runtime.blob.BlobServer.getFileInternal(
>> BlobServer.java:444)
>> at org.apache.flink.runtime.blob.BlobServer.getFile(BlobServer.java:
>> 369)
>> at org.apache.flink.runtime.rest.handler.taskmanager.
>> AbstractTaskManagerFileHandler.lambda$respondToRequest$0(
>> AbstractTaskManagerFileHandler.java:133)
>> ... 9 more
>>
>> On Wed, 3 Nov 2021 at 02:48, Guowei Ma <guowei....@gmail.com> wrote:
>>
>>> Hi, Smith
>>>
>>> It seems that the log file
>>> (blob_t-274d3c2d5acd78ced877d898b1877b10b62a64df-590b54325d599a6782a77413691e0a7b)
>>> was deleted for some reason. AFAIK nobody else has reported this
>>> exception (maybe others know what happened).
>>> 1. I think that if you refresh the page you will see the correct result,
>>> because that triggers another file retrieval from the TM.
>>> 2. It might be safer to set a dedicated blob directory (other than /tmp)
>>> via `blob.storage.directory` [1].
>>>
>>> [1]
>>> https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/config/#blob-storage-directory
>>>
>>>
>>> Best,
>>> Guowei
>>>
>>>
>>> On Wed, Nov 3, 2021 at 7:50 AM John Smith <java.dev....@gmail.com>
>>> wrote:
>>>
>>>> Hi, I'm running Flink 1.10.0 with 3 ZooKeepers, 3 job nodes and 3 task
>>>> nodes, and I saw this exception in the job node logs...
>>>> 2021-11-02 23:20:22,703 ERROR
>>>> org.apache.flink.runtime.rest.handler.taskmanager.
>>>> TaskManagerLogFileHandler - Unhandled exception.
>>>> org.apache.flink.util.FlinkException: Could not retrieve file from
>>>> transient blob store.
>>>> at org.apache.flink.runtime.rest.handler.taskmanager.
>>>> AbstractTaskManagerFileHandler.lambda$respondToRequest$0(
>>>> AbstractTaskManagerFileHandler.java:135)
>>>> at java.util.concurrent.CompletableFuture.uniAccept(
>>>> CompletableFuture.java:670)
>>>> at java.util.concurrent.CompletableFuture$UniAccept.tryFire(
>>>> CompletableFuture.java:646)
>>>> at java.util.concurrent.CompletableFuture$Completion.run(
>>>> CompletableFuture.java:456)
>>>> at org.apache.flink.shaded.netty4.io.netty.util.concurrent.
>>>> AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
>>>> at org.apache.flink.shaded.netty4.io.netty.util.concurrent.
>>>> SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:
>>>> 416)
>>>> at org.apache.flink.shaded.netty4.io.netty.channel.nio.NioEventLoop
>>>> .run(NioEventLoop.java:515)
>>>> at org.apache.flink.shaded.netty4.io.netty.util.concurrent.
>>>> SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
>>>> at org.apache.flink.shaded.netty4.io.netty.util.internal.
>>>> ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
>>>> at java.lang.Thread.run(Thread.java:748)
>>>> Caused by: java.io.FileNotFoundException: Local file
>>>> /tmp/blobStore-9cb73f27-11db-4c42-a3fc-9b77f558e722/no_job/blob_t-274d3c2d5acd78ced877d898b1877b10b62a64df-590b54325d599a6782a77413691e0a7b
>>>> does not exist and failed to copy from blob store.
>>>> at org.apache.flink.runtime.blob.BlobServer.getFileInternal(
>>>> BlobServer.java:516)
>>>> at org.apache.flink.runtime.blob.BlobServer.getFileInternal(
>>>> BlobServer.java:444)
>>>> at org.apache.flink.runtime.blob.BlobServer.getFile(BlobServer
>>>> .java:369)
>>>> at org.apache.flink.runtime.rest.handler.taskmanager.
>>>> AbstractTaskManagerFileHandler.lambda$respondToRequest$0(
>>>> AbstractTaskManagerFileHandler.java:133)
>>>> ... 9 more
>>>>
>>>
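
For reference, suggestion 2 in the quoted reply (a dedicated blob directory
instead of /tmp) could look roughly like the fragment below in
flink-conf.yaml. This is only a sketch: /var/lib/flink/blob is a hypothetical
path that does not appear anywhere in this thread, and it is assumed to exist
on each node and to be writable by the flink user. It would only help if the
blob file disappeared because something cleaned up /tmp, which has not been
confirmed here.

```yaml
# flink-conf.yaml (sketch, not taken from the cluster in this thread)
# Assumption: /var/lib/flink/blob is a hypothetical local directory chosen for
# illustration; pick any location outside /tmp that the flink user can write to.
blob.storage.directory: /var/lib/flink/blob
```

The JobManager and TaskManager processes would need a restart to pick up the
new value. After that, the "Created BLOB server storage directory" line in
the log should presumably point below the new path rather than /tmp.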