Flink 1.10 on YARN.
The job contains a MapState[Long, Bean].
https://www.helloimg.com/image/Pe1QR
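After the job had been running for a while (about 20 minutes), the exception shown in the attachment was thrown.
I looked at the corresponding source code but could not work out what causes the exception.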
https://www.helloimg.com/image/Peqc5
2020-07-02 17:06:19,409 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - KeyedProcess ->
Sink: Unnamed (2/3) (d847db42ed1d92ac373f9ccf27b846f0) switched from DEPLOYING
to RUNNING.
2020-07-02 17:06:19,410 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - KeyedProcess ->
Sink: Unnamed (3/3) (ca825ba9712eb520ff6de6b0f9de4dc1) switched from DEPLOYING
to RUNNING.
2020-07-02 17:06:19,426 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - KeyedProcess ->
Sink: Unnamed (3/3) (b05f5b66fd4c65a9032bb0140a4ce3d1) switched from DEPLOYING
to RUNNING.
2020-07-02 17:06:19,427 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - KeyedProcess ->
Sink: Unnamed (2/3) (d5fe791177a64ea718eec61b82542e46) switched from DEPLOYING
to RUNNING.
2020-07-02 17:06:19,472 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - groupBy:
(userId, sourceUuid, categoryId, gender), select: (userId, sourceUuid,
categoryId, gender, MAX(siteId) AS siteId, MAX(score) AS score) -> select:
(userId, categoryId, gender, score, siteId) (1/3)
(f8f49357b9121d97816d5f83569cd6ac) switched from DEPLOYING to RUNNING.
2020-07-02 17:06:19,472 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - KeyedProcess ->
Sink: Unnamed (1/3) (37befed4aefab35588e5f6d4c372b8c4) switched from DEPLOYING
to RUNNING.
2020-07-02 17:06:19,492 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - KeyedProcess ->
Sink: Unnamed (1/3) (2de7f2a3809e8d7e97197cbc0f7c8b4b) switched from DEPLOYING
to RUNNING.
2020-07-02 17:06:19,498 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - KeyedProcess ->
Sink: Unnamed (1/3) (1fe6456a019617839a573d55b1194541) switched from DEPLOYING
to RUNNING.
2020-07-02 17:06:52,263 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - KeyedProcess ->
Sink: Unnamed (2/3) (d5fe791177a64ea718eec61b82542e46) switched from RUNNING to
FAILED on
org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot@61d8240a.
java.lang.IllegalArgumentException: Position out of bounds.
at org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:139)
at org.apache.flink.core.memory.DataOutputSerializer.setPosition(DataOutputSerializer.java:368)
at org.apache.flink.contrib.streaming.state.RocksDBSerializedCompositeKeyBuilder.resetToKey(RocksDBSerializedCompositeKeyBuilder.java:189)
at org.apache.flink.contrib.streaming.state.RocksDBSerializedCompositeKeyBuilder.buildCompositeKeyNamesSpaceUserKey(RocksDBSerializedCompositeKeyBuilder.java:144)
at org.apache.flink.contrib.streaming.state.AbstractRocksDBState.serializeCurrentKeyWithGroupAndNamespacePlusUserKey(AbstractRocksDBState.java:149)
at org.apache.flink.contrib.streaming.state.RocksDBMapState.get(RocksDBMapState.java:120)
at org.apache.flink.runtime.state.ttl.TtlMapState.lambda$getWrapped$0(TtlMapState.java:61)
at org.apache.flink.runtime.state.ttl.AbstractTtlDecorator.getWrappedWithTtlCheckAndUpdate(AbstractTtlDecorator.java:92)
at org.apache.flink.runtime.state.ttl.TtlMapState.getWrapped(TtlMapState.java:60)
at org.apache.flink.runtime.state.ttl.TtlMapState.contains(TtlMapState.java:93)
at org.apache.flink.runtime.state.UserFacingMapState.contains(UserFacingMapState.java:72)
at com.netease.wm.trace.usertag.RealTimeUserTag$UserTimeTradeProcess.processElement(RealTimeUserTag.scala:272)
at com.netease.wm.trace.usertag.RealTimeUserTag$UserTimeTradeProcess.processElement(RealTimeUserTag.scala:237)
at org.apache.flink.streaming.api.operators.KeyedProcessOperator.processElement(KeyedProcessOperator.java:85)
at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:173)
at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:151)
at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:128)
at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:69)
at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:310)
at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:187)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:485)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:469)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:708)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:533)
at java.lang.Thread.run(Thread.java:745)
2020-07-02 17:06:52,264 INFO
org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionStrategy
- Calculating tasks to restart to recover the failed task
bad2e596adf8fb687a74c9b1f886bf29_1.
2020-07-02 17:06:52,264 INFO
org.apache.flink.runtime.executiongraph.failover.flip1.RestartPipelinedRegionStrategy
- 27 tasks should be restarted to recover the failed task
bad2e596adf8fb687a74c9b1f886bf29_1.
2020-07-02 17:06:52,264 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - Job wending
realTime userTag (4244fd52ee14a2c36f4206b4f50c8a0b) switched from state RUNNING
to RESTARTING.
2020-07-02 17:06:52,265 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - groupBy:
(userId, sourceUuid, categoryId, gender), select: (userId, sourceUuid,
categoryId, gender, MAX(siteId) AS siteId, MAX(score) AS score) -> select:
(userId, categoryId, gender, score, siteId) (1/3)
(f8f49357b9121d97816d5f83569cd6ac) switched from RUNNING to CANCELING.
2020-07-02 17:06:52,265 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - KeyedProcess ->
Sink: Unnamed (1/3) (37befed4aefab35588e5f6d4c372b8c4) switched from RUNNING to
CANCELING.
2020-07-02 17:06:52,265 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - KeyedProcess ->
Sink: Unnamed (2/3) (325cdd7bac19b5c6ee04c3b5280bab88) switched from RUNNING to
CANCELING.
2020-07-02 17:06:52,265 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - KeyedProcess ->
Sink: Unnamed (3/3) (96dacc54145d41172bab1b7d844c7364) switched from RUNNING to
CANCELING.
2020-07-02 17:06:52,265 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - groupBy:
(userId, sourceUuid, categoryId, gender), select: (userId, sourceUuid,
categoryId, gender, MAX(siteId) AS siteId, MAX(score) AS score) -> select:
(userId, categoryId, gender, score, siteId) (3/3)
(751e52819bdb8419ca75cf8a6e320d75) switched from RUNNING to CANCELING.
2020-07-02 17:06:52,265 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - groupBy:
(userId, sourceUuid, categoryId, gender), select: (userId, sourceUuid,
categoryId, gender, MAX(siteId) AS siteId, MAX(score) AS score) -> select:
(userId, categoryId, gender, score, siteId) (2/3)
(fb61d9b63525eb7578f7743d21bbeafd) switched from RUNNING to CANCELING.
2020-07-02 17:06:52,265 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - KeyedProcess ->
Sink: Unnamed (1/3) (2de7f2a3809e8d7e97197cbc0f7c8b4b) switched from RUNNING to
CANCELING.
2020-07-02 17:06:52,265 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Custom
Source -> Map -> Filter -> Timestamps/Watermarks -> Filter -> (Filter -> Map,
Filter) (3/3) (fa41ad2f5be82768b19c3dfc23b77069) switched from RUNNING to
CANCELING.
2020-07-02 17:06:52,265 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - KeyedProcess ->
Sink: Unnamed (2/3) (d847db42ed1d92ac373f9ccf27b846f0) switched from RUNNING to
CANCELING.
2020-07-02 17:06:52,265 INFO
org.apache.flink.runtime.executiongraph.ExecutionGraph - KeyedProcess ->
Sink: Unnamed (3/3) (ca825ba9712eb520ff6de6b0f9de4dc1) switched from RUNNING to
CANCELING.