gyfora opened a new pull request, #571:
URL: https://github.com/apache/flink-kubernetes-operator/pull/571
## What is the purpose of the change
This change introduces per-edge output ratio tracking for the JobGraph
instead of the previous per-vertex tracking.
The previous approach assumed that all downstream vertices receive the
same output (all edges have the same num records out), which is not true when
side outputs or more complex operator chains are present in the JobVertex.
While Flink does not directly expose per-edge metrics for sent/received records,
in many cases these can be computed:
- If the upstream vertex has a single output, we use its num records out
- If the downstream vertex has a single input, we use the downstream num
records in
- If the downstream vertex has only inputs with a single output, we
subtract the numRecordsOut of the other inputs from its num records in
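The three rules above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the class name `EdgeRecordsSketch`, the string vertex ids, the maps, and the counter values are all made up for the example. It assumes the rules are tried in the order listed, and reads the third rule as "every *other* input of the downstream vertex has a single output". When none of the rules applies, the sketch simply reports that the edge count is unknown.

```java
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

// Illustrative sketch only; none of these names come from the operator code base.
class EdgeRecordsSketch {

    // Hypothetical topology: A -> B, A -> C (e.g. a side output), D -> C.
    static final Map<String, List<String>> OUTPUTS = Map.of(
            "A", List.of("B", "C"),
            "B", List.<String>of(),
            "C", List.<String>of(),
            "D", List.of("C"));
    static final Map<String, List<String>> INPUTS = Map.of(
            "A", List.<String>of(),
            "B", List.of("A"),
            "C", List.of("A", "D"),
            "D", List.<String>of());

    // Made-up vertex-level counters standing in for numRecordsOut / numRecordsIn.
    static final Map<String, Long> RECORDS_OUT = Map.of("A", 100L, "B", 0L, "C", 0L, "D", 30L);
    static final Map<String, Long> RECORDS_IN = Map.of("A", 0L, "B", 60L, "C", 70L, "D", 0L);

    /** Records sent over the edge from -> to, or empty if it cannot be derived. */
    static OptionalLong edgeRecords(String from, String to) {
        // Rule 1: the upstream vertex has a single output, so all of its
        // output went over this edge.
        if (OUTPUTS.get(from).size() == 1) {
            return OptionalLong.of(RECORDS_OUT.get(from));
        }
        // Rule 2: the downstream vertex has a single input, so all of its
        // input came over this edge.
        if (INPUTS.get(to).size() == 1) {
            return OptionalLong.of(RECORDS_IN.get(to));
        }
        // Rule 3: subtract what the other inputs sent, which is only known
        // when each of them has a single output.
        long remaining = RECORDS_IN.get(to);
        for (String other : INPUTS.get(to)) {
            if (other.equals(from)) {
                continue;
            }
            if (OUTPUTS.get(other).size() != 1) {
                return OptionalLong.empty();
            }
            remaining -= RECORDS_OUT.get(other);
        }
        return OptionalLong.of(remaining);
    }

    public static void main(String[] args) {
        System.out.println(edgeRecords("A", "B").getAsLong()); // 60 (rule 2)
        System.out.println(edgeRecords("A", "C").getAsLong()); // 40 (rule 3: 70 - 30)
        System.out.println(edgeRecords("D", "C").getAsLong()); // 30 (rule 1)
    }
}
```

With per-vertex tracking, both of A's edges would have been attributed the full 100 records; the per-edge view splits them into 60 and 40.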
As a related change we should also introduce per-edge record count metrics
in Flink, which the autoscaler algorithm could then use directly when they are
enabled.
## Brief change log
- *Remove per-vertex OUTPUT_RATIO and TRUE_OUTPUT_RATE metrics*
- *Rework the collected metrics / metric history to allow storing per-edge
output ratio metrics*
- *Compute per-edge output ratio depending on the topology*
- *Use downstream num records in whenever possible instead of upstream num
records out as the latter is very unreliable due to some Flink bugs
(https://issues.apache.org/jira/browse/FLINK-18808 &
https://issues.apache.org/jira/browse/FLINK-31752)*
## Verifying this change
New unit tests added for the output ratio computation and existing tests
cover the current behaviour.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): no
- The public API, i.e., are there any changes to the `CustomResourceDescriptors`:
no
- Core observer or reconciler logic that is regularly executed: no
## Documentation
- Does this pull request introduce a new feature? no
- If yes, how is the feature documented? not applicable