A failed global checkpoint should increment the failure count by exactly one, regardless of how many subtasks failed. I would therefore hope that there is no need to set a different threshold for the two cases you described, @tweise. It is true, however, that the price the user pays is that a persistent problem is only detected after a longer delay.
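To illustrate the counting rule I have in mind, here is a minimal sketch. `CheckpointFailureCounter` and its methods are hypothetical names made up for this comment, not existing Flink classes, and the real bookkeeping in the coordinator would likely look different. The idea is only that failures are de-duplicated by checkpoint ID, so several failing subtasks of the same checkpoint count once.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper for illustration only; not part of Flink's API. */
final class CheckpointFailureCounter {
    // Assumed setting: how many consecutive failed checkpoints are tolerated.
    private final int tolerableFailures;
    // IDs of checkpoints that have already been counted as failed.
    private final Set<Long> failedCheckpointIds = new HashSet<>();
    private int failureCount = 0;

    CheckpointFailureCounter(int tolerableFailures) {
        this.tolerableFailures = tolerableFailures;
    }

    /**
     * Called once per failing subtask. The count grows only the first time
     * a given checkpoint ID is reported. Returns true once the number of
     * failed checkpoints exceeds the tolerated threshold.
     */
    boolean onSubtaskFailure(long checkpointId) {
        if (failedCheckpointIds.add(checkpointId)) {
            failureCount++;
        }
        return failureCount > tolerableFailures;
    }

    /** A completed checkpoint resets the count of consecutive failures. */
    void onCheckpointSuccess() {
        failureCount = 0;
        failedCheckpointIds.clear();
    }

    int getFailureCount() {
        return failureCount;
    }
}
```

Under this assumed semantics, a threshold of n flags a persistent problem only at the (n+1)-th failed checkpoint, no matter how many subtasks fail each time. That is the detection delay mentioned above.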
[ Full content available at: https://github.com/apache/flink/pull/6567 ]
