[ 
https://issues.apache.org/jira/browse/BEAM-10691?focusedWorklogId=470681&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-470681
 ]

ASF GitHub Bot logged work on BEAM-10691:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/Aug/20 12:53
            Start Date: 14/Aug/20 12:53
    Worklog Time Spent: 10m 
      Work Description: je-ik commented on pull request #12551:
URL: https://github.com/apache/beam/pull/12551#issuecomment-674059727


   @mxm I think I figured it out. The current commit 
https://github.com/apache/beam/pull/12551/commits/fe75b7e01d55a0c24486b032330d26d66b57f9c2
 works for my pipeline. I think the explanation is that the pipeline was not 
actually stuck, only so slow that it looked stuck. 
PriorityQueue#remove(Object) has O(N) complexity, so when there are very many 
timers (with nearly the same output timestamp), removing them slows the 
pipeline so much that it appears to stop running.
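
To illustrate the cost mentioned above: java.util.PriorityQueue#remove(Object) 
scans the backing array linearly, so releasing N holds one by one is O(N^2) 
overall. The sketch below is only an assumption about one way to avoid that 
scan, not the change made in the linked commit: a hypothetical 
OutputTimestampHolds class keeps a count per timestamp in a TreeMap, so adding, 
releasing, and reading the minimum hold are all O(log N) in the number of 
distinct timestamps.

```java
import java.util.TreeMap;

/**
 * Hypothetical multiset of timer output timestamps (illustration only;
 * this is not Beam's FlinkTimerInternals implementation).
 */
class OutputTimestampHolds {
  // timestamp in millis -> number of pending timers holding that timestamp
  private final TreeMap<Long, Integer> counts = new TreeMap<>();

  /** Registers one hold at the given timestamp. */
  void add(long timestampMillis) {
    counts.merge(timestampMillis, 1, Integer::sum);
  }

  /**
   * Releases one hold at the given timestamp.
   * Returns false if no hold with that timestamp was registered.
   */
  boolean remove(long timestampMillis) {
    Integer count = counts.get(timestampMillis);
    if (count == null) {
      return false;
    }
    if (count == 1) {
      counts.remove(timestampMillis);
    } else {
      counts.put(timestampMillis, count - 1);
    }
    return true;
  }

  /** Returns the minimum held timestamp, or Long.MAX_VALUE if nothing is held. */
  long minHold() {
    return counts.isEmpty() ? Long.MAX_VALUE : counts.firstKey();
  }
}
```

With many timers sharing nearly the same output timestamp, the map stays 
small because duplicates collapse into a single counted entry.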


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 470681)
    Time Spent: 5h 10m  (was: 5h)

> FlinkRunner: pipeline might get stuck due to timer watermark hold not being 
> released
> ------------------------------------------------------------------------------------
>
>                 Key: BEAM-10691
>                 URL: https://issues.apache.org/jira/browse/BEAM-10691
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-flink
>    Affects Versions: 2.23.0, 2.24.0
>            Reporter: Jan Lukavský
>            Assignee: Jan Lukavský
>            Priority: P1
>          Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> The pipeline might stop advancing the watermark in certain cases because a 
> timer output timestamp is not released from 
> FlinkTimerInternals#outputTimestampQueue. The pipeline has to be restarted 
> from a checkpoint to reload the cache and free the watermark hold.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
