[
https://issues.apache.org/jira/browse/FLINK-19009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kevin Liu updated FLINK-19009:
------------------------------
Comment: was deleted
(was: I think I can fix this bug, but first we need to reach a consensus on the
definition of 'a failing/recovering situation' in the Flink docs. There are 10
types of JobStatus. 'FAILING' clearly qualifies, but what about the others, for
example 'RECONCILING'? ([~jark] What do you think, or do you know someone
familiar with this part?))
> wrong way to calculate the "downtime" metric
> --------------------------------------------
>
> Key: FLINK-19009
> URL: https://issues.apache.org/jira/browse/FLINK-19009
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination, Runtime / Metrics
> Affects Versions: 1.7.2, 1.8.0
> Reporter: Zhinan Cheng
> Priority: Trivial
> Fix For: 1.12.0
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> Currently, the Flink system metric "downtime" is not calculated the way the
> doc describes it. The reported value is actually the current timestamp minus
> the timestamp when the job started.
>
> But the Flink doc (https://flink.apache.org/gettinghelp.html) describes this
> metric as the current timestamp minus the timestamp when the job failed.
>
> I believe we should update the code for this metric to match the Flink doc.
> An easy way to solve this is to subtract the latest uptime timestamp from the
> current timestamp.
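
To make the difference between the two calculations concrete, here is a minimal
sketch in Java. It is not Flink's actual implementation: the class name, the
method names, and the NOT_REACHED sentinel are all hypothetical, and returning 0
for a job that is not currently failing is an assumption of the sketch.

```java
public final class DowntimeSketch {

    /** Hypothetical sentinel: the job has never entered the given status. */
    static final long NOT_REACHED = 0L;

    /** Behaviour described in the report: elapsed time since the job started. */
    static long downtimeAsReported(long startTimestamp, long now) {
        return now - startTimestamp;
    }

    /**
     * Behaviour described in the doc: elapsed time since the job failed.
     * Assumption of this sketch: a job that never failed, or that has been
     * running again since its last failure, has a downtime of 0.
     */
    static long downtimeAsDocumented(long failingTimestamp, long runningTimestamp, long now) {
        if (failingTimestamp == NOT_REACHED) {
            return 0L; // the job has never failed
        }
        if (runningTimestamp >= failingTimestamp) {
            return 0L; // the job recovered after the last failure
        }
        return now - failingTimestamp;
    }

    public static void main(String[] args) {
        long start = 1_000L;   // job started (and first entered running)
        long failed = 3_000L;  // job entered the failing state
        long now = 5_000L;     // time at which the metric is read

        System.out.println("as reported:   " + downtimeAsReported(start, now));
        System.out.println("as documented: " + downtimeAsDocumented(failed, start, now));
    }
}
```

With these sample timestamps the two methods disagree, because the first one
measures from the job start and the second from the failure.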