[
https://issues.apache.org/jira/browse/FLINK-7876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16212469#comment-16212469
]
ASF GitHub Bot commented on FLINK-7876:
---------------------------------------
GitHub user tillrohrmann opened a pull request:
https://github.com/apache/flink/pull/4872
[FLINK-7876] Register TaskManagerMetricGroup under ResourceID
## What is the purpose of the change
This commit changes that TaskManagerMetricGroups are now registered under
the
TaskManager's ResourceID instead of the InstanceID. This allows to create
the
TaskManagerMetricGroup at startup of the TaskManager.
Moreover, it pulls the MetricRegistry out of JobManager and TaskManager.
This
allows to reuse the same MetricRegistry across multiple instances (e.g. in
the
FlinkMiniCluster case). Moreover, it ensures proper cleanup of a potentially
started MetricyQueryServiceActor.
Last but not least the PR corrects how the `MetricRegistry` is instantiated
in the `ClusterEntrypoint`.
## Brief change log
- Pull MetricRegistry out of `JobManager` and `TaskManager`
- Provide `TaskManagerMetricGroup` and `JobManagerMetricGroup` at start up
of respective component
- Register `TaskManagerMetricGroup` under `TaskManager's` `ResourceID`
- Adapt `TaskManagersHandler` to serve `ResourceID` instead of `InstanceID`
- Adapt `MetricFetcher` to retain `TaskManagers` based on `ResourceID`
- Correct instantiation of `MetricRegistry` in `ClusterEntrypoint`
- Make `MetricRegistry` an interface and rename `MetricRegistry` into
`MetricRegistryImpl` as one of the implementations of the `MetricRegistry`
interface
- Introduce `NoOpMetricRegistry` implementation which simply does nothing
and can be used when testing components for which one does not want to register
metrics
## Verifying this change
Tested manually.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with
`@Public(Evolving)`: (no)
- The serializers: (no)
- The runtime per-record code paths (performance sensitive): (no)
- Anything that affects deployment or recovery: JobManager (and its
components), Checkpointing, Yarn/Mesos, ZooKeeper: (no)
## Documentation
- Does this pull request introduce a new feature? (no)
- If yes, how is the feature documented? (not applicable)
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/tillrohrmann/flink
taskManagerMetricGroupRegistration
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/4872.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4872
----
commit 9106445f757099763c350b96dc0a0afad8f9b081
Author: Till <[email protected]>
Date: 2017-10-18T11:50:06Z
[FLINK-7863] Generalize MetricFetcher to work with a RestfulGateway
Add more logging to MetricFetcher
commit 577ac1d23afe9fe04d1026600e15f26270f8252e
Author: Till <[email protected]>
Date: 2017-10-19T12:51:07Z
Fix MetricFetcherTest
commit 7371c68e09ed983e21ee46110566f60b57dcf942
Author: Till <[email protected]>
Date: 2017-10-19T15:22:53Z
[FLINK-7876] Register TaskManagerMetricGroup under ResourceID
This commit changes that TaskManagerMetricGroups are now registered under
the
TaskManager's ResourceID instead of the InstanceID. This allows to create
the
TaskManagerMetricGroup at startup of the TaskManager.
Moreover, it pulls the MetricRegistry out of JobManager and TaskManager.
This
allows to reuse the same MetricRegistry across multiple instances (e.g. in
the
FlinkMiniCluster case). Moreover, it ensures proper cleanup of a potentially
started MetricyQueryServiceActor.
Change TaskManagersHandler to work with ResourceID instead of InstanceID
Adapt MetricFetcher to use ResourceID instead of InstanceID
commit 660d33ee72de982996e49a88f99591a2e2062b22
Author: Till <[email protected]>
Date: 2017-10-20T10:13:10Z
[FLINK-7876] Properly start and shutdown MetricRegistry by ClusterEntrypoint
----
> Register TaskManagerMetricGroup under ResourceID instead of InstanceID
> ----------------------------------------------------------------------
>
> Key: FLINK-7876
> URL: https://issues.apache.org/jira/browse/FLINK-7876
> Project: Flink
> Issue Type: Improvement
> Components: Metrics
> Affects Versions: 1.4.0
> Reporter: Till Rohrmann
> Assignee: Till Rohrmann
> Priority: Minor
> Labels: flip-6
>
> Currently, the {{TaskManager}} registers the {{TaskManagerMetricGroup}} under
> its {{InstanceID}} and thereby binding its metrics effectively to the
> lifetime of its registration with the {{JobManager}}. This has also
> implications how the REST handler retrieve the TaskManager metrics, namely by
> its {{InstanceID}}.
> I would actually propose to register the {{TaskManagerMetricGroup}} under the
> {{TaskManager}}/{{TaskExecutor}} {{ResourceID}} which is valid over the whole
> lifetime of the {{TaskManager}}/{{TaskExecutor}}. That way we would also be
> able to query metrics independent of the connection status to the
> {{JobManager}}.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)