[ https://issues.apache.org/jira/browse/FLINK-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645035#comment-14645035 ]
ASF GitHub Bot commented on FLINK-2406:
---------------------------------------
Github user uce commented on the pull request:
https://github.com/apache/flink/pull/938#issuecomment-125751865
Out of curiosity: why was it failing sometimes on Travis and not locally?
And how did you discover this? From the program-level logs?
Another thing that came to mind: in the long run, do we need a more
flexible way of configuring the retry policy? In my understanding, the number of
retries is fixed. I can see an issue for very long-running programs that fail
once in a while but operate normally most of the time: at some point they will
exhaust the fixed number of retries and fail permanently.
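To illustrate the kind of policy I have in mind, here is a minimal, purely hypothetical sketch (not anything Flink provides today, and not the PR's code): a policy that bounds the number of failures per sliding time window instead of capping the total number of retries, so a long-running job that fails only occasionally is never permanently killed.

    import java.util.ArrayDeque;
    import java.util.Deque;

    /** Hypothetical retry policy: allow at most N failures per sliding time window. */
    public class FailureRateRetryPolicy {

        private final int maxFailuresPerInterval;   // e.g. 3 failures ...
        private final long intervalMillis;          // ... per 10 minutes
        private final Deque<Long> failureTimestamps = new ArrayDeque<>();

        public FailureRateRetryPolicy(int maxFailuresPerInterval, long intervalMillis) {
            this.maxFailuresPerInterval = maxFailuresPerInterval;
            this.intervalMillis = intervalMillis;
        }

        /** Called on each job failure; returns true if another restart should be attempted. */
        public synchronized boolean canRetry(long nowMillis) {
            // Evict failures that fall outside the sliding window.
            while (!failureTimestamps.isEmpty()
                    && nowMillis - failureTimestamps.peekFirst() > intervalMillis) {
                failureTimestamps.removeFirst();
            }
            failureTimestamps.addLast(nowMillis);
            return failureTimestamps.size() <= maxFailuresPerInterval;
        }
    }

With, say, maxFailuresPerInterval = 3 and intervalMillis = 600000, a job could be restarted indefinitely as long as it never fails more than three times within any ten-minute span.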
> Abstract BarrierBuffer to an exchangeable BarrierHandler
> --------------------------------------------------------
>
> Key: FLINK-2406
> URL: https://issues.apache.org/jira/browse/FLINK-2406
> Project: Flink
> Issue Type: Sub-task
> Components: Streaming
> Affects Versions: 0.10
> Reporter: Stephan Ewen
> Assignee: Stephan Ewen
> Fix For: 0.10
>
>
> We need to make the Checkpoint handling pluggable, to allow us to use
> different implementations:
> - BarrierBuffer for "exactly once" processing. This inevitably introduces a
> bit of latency.
> - BarrierTracker for "at least once" processing, with no added latency.
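A minimal sketch of the pluggable design described in the issue, assuming a hypothetical BarrierHandler interface; the names, method signatures, and the println placeholders are illustrative only and do not reflect Flink's actual interfaces:

    interface BarrierHandler {
        /** Called for every checkpoint barrier arriving on an input channel. */
        void processBarrier(long checkpointId, int channelIndex) throws Exception;
    }

    /** "Exactly once" variant: tracks which channels have delivered the barrier and
     *  triggers the checkpoint only once all of them have. In the real BarrierBuffer,
     *  channels that already delivered the barrier are blocked and their records are
     *  buffered in the meantime, which is the source of the added latency. */
    class AligningBarrierHandler implements BarrierHandler {
        private final boolean[] barrierSeen;
        private int numBarriersReceived;

        AligningBarrierHandler(int numChannels) {
            this.barrierSeen = new boolean[numChannels];
        }

        @Override
        public void processBarrier(long checkpointId, int channelIndex) {
            if (!barrierSeen[channelIndex]) {
                barrierSeen[channelIndex] = true;
                numBarriersReceived++;
            }
            if (numBarriersReceived == barrierSeen.length) {
                // All inputs have delivered the barrier: trigger the checkpoint, then reset.
                System.out.println("checkpoint " + checkpointId + " aligned");
                java.util.Arrays.fill(barrierSeen, false);
                numBarriersReceived = 0;
            }
        }
    }

    /** "At least once" variant: only counts barriers per checkpoint and never blocks
     *  any channel, so there is no added latency, but records arriving between the
     *  first and last barrier may be replayed after a failure. */
    class CountingBarrierHandler implements BarrierHandler {
        private final int numChannels;
        private final java.util.Map<Long, Integer> barrierCounts = new java.util.HashMap<>();

        CountingBarrierHandler(int numChannels) {
            this.numChannels = numChannels;
        }

        @Override
        public void processBarrier(long checkpointId, int channelIndex) {
            int seen = barrierCounts.merge(checkpointId, 1, Integer::sum);
            if (seen == numChannels) {
                barrierCounts.remove(checkpointId);
                System.out.println("checkpoint " + checkpointId + " complete");
            }
        }
    }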
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)