[ 
https://issues.apache.org/jira/browse/BEAM-7112?focusedWorklogId=230448&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-230448
 ]

ASF GitHub Bot logged work on BEAM-7112:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Apr/19 15:15
            Start Date: 20/Apr/19 15:15
    Worklog Time Spent: 10m 
      Work Description: tweise commented on issue #8351: [BEAM-7112] [flink] Defer state cleanup till bundle completion
URL: https://github.com/apache/beam/pull/8351#issuecomment-485135041
 
 
   When running this with checkpointing enabled, I get this error:
   ```
   AsynchronousException{java.lang.Exception: Could not materialize checkpoint 1 for operator [1]statefulCount (89/128).}
        at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointExceptionHandler.tryHandleCheckpointException(StreamTask.java:1154)
        at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.handleExecutionException(StreamTask.java:948)
        at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:885)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
   Caused by: java.lang.Exception: Could not materialize checkpoint 1 for operator [1]statefulCount (89/128).
        at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.handleExecutionException(StreamTask.java:943)
        ... 6 more
   Caused by: java.util.concurrent.ExecutionException: java.lang.ArrayIndexOutOfBoundsException: -175
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:192)
        at org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:53)
        at org.apache.flink.streaming.api.operators.OperatorSnapshotFinalizer.<init>(OperatorSnapshotFinalizer.java:47)
        at org.apache.flink.streaming.runtime.tasks.StreamTask$AsyncCheckpointRunnable.run(StreamTask.java:854)
        ... 5 more
   Caused by: java.lang.ArrayIndexOutOfBoundsException: -175
        at org.apache.flink.runtime.state.heap.CopyOnWriteStateTableSnapshot.partitionEntriesByKeyGroup(CopyOnWriteStateTableSnapshot.java:147)
        at org.apache.flink.runtime.state.heap.CopyOnWriteStateTableSnapshot.writeMappingsInKeyGroup(CopyOnWriteStateTableSnapshot.java:178)
        at org.apache.flink.runtime.state.heap.HeapKeyedStateBackend$HeapSnapshotStrategy$1.performOperation(HeapKeyedStateBackend.java:697)
        at org.apache.flink.runtime.state.heap.HeapKeyedStateBackend$HeapSnapshotStrategy$1.performOperation(HeapKeyedStateBackend.java:641)
        at org.apache.flink.runtime.io.async.AbstractAsyncCallableWithResources.call(AbstractAsyncCallableWithResources.java:75)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at org.apache.flink.util.FutureUtil.runIfNotDoneAndGet(FutureUtil.java:50)
        ... 7 more
   ```
   This shows up after clearing state with the correct key.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 230448)
    Time Spent: 2h 40m  (was: 2.5h)

> State cleanup interferes with user timer callback
> -------------------------------------------------
>
>                 Key: BEAM-7112
>                 URL: https://issues.apache.org/jira/browse/BEAM-7112
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>    Affects Versions: 2.12.0
>            Reporter: Thomas Weise
>            Assignee: Thomas Weise
>            Priority: Major
>              Labels: portability-flink
>          Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Cleanup timers and user timers are fired at the watermark. Processing of 
> timers in the SDK worker is asynchronous, so it is possible that the state is 
> already removed when the user timer callback executes.
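
The race described above, and the effect of deferring cleanup until bundle completion, can be illustrated with a small self-contained model. This is only a sketch under stated assumptions, not the Flink runner's code: `DeferredCleanupSketch`, `Bundle`, `fireUserTimer`, `fireCleanupTimer` and `finishBundle` are hypothetical names, and the asynchronous SDK worker is modelled as a queue of callbacks that run only when the bundle finishes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

public class DeferredCleanupSketch {

    /** Hypothetical stand-in for one bundle of work over keyed user state. */
    static final class Bundle {
        final Map<String, Integer> state = new HashMap<>();
        // Timer callbacks handed to the (simulated) SDK worker; they run later,
        // not at the moment the timer fires.
        final Queue<Runnable> pendingCallbacks = new ArrayDeque<>();
        // Keys whose state removal is postponed until the bundle completes.
        final List<String> deferredCleanup = new ArrayList<>();
        // What each user timer callback saw when it finally executed.
        final List<String> observed = new ArrayList<>();
        final boolean deferCleanup;

        Bundle(boolean deferCleanup) {
            this.deferCleanup = deferCleanup;
        }

        /** User timer fires: the callback is queued and reads state when it runs. */
        void fireUserTimer(String key) {
            pendingCallbacks.add(() -> observed.add(key + "=" + state.get(key)));
        }

        /** Cleanup timer fires: remove state now, or remember the key for later. */
        void fireCleanupTimer(String key) {
            if (deferCleanup) {
                deferredCleanup.add(key);
            } else {
                state.remove(key);
            }
        }

        /** Bundle completes: run queued callbacks first, then any deferred cleanup. */
        void finishBundle() {
            Runnable callback;
            while ((callback = pendingCallbacks.poll()) != null) {
                callback.run();
            }
            for (String key : deferredCleanup) {
                state.remove(key);
            }
            deferredCleanup.clear();
        }
    }

    /** Fires a user timer and a cleanup timer for the same key at the same watermark. */
    static List<String> run(boolean deferCleanup) {
        Bundle bundle = new Bundle(deferCleanup);
        bundle.state.put("k", 3);
        bundle.fireUserTimer("k");
        bundle.fireCleanupTimer("k");
        bundle.finishBundle();
        return bundle.observed;
    }

    public static void main(String[] args) {
        System.out.println("eager cleanup:    " + run(false));
        System.out.println("deferred cleanup: " + run(true));
    }
}
```

In this model the eager variant lets the user callback observe `k=null`, because the state is removed before the queued callback runs, while the deferred variant observes `k=3` and removes the state only afterwards.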



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
