[
https://issues.apache.org/jira/browse/BEAM-4775?focusedWorklogId=197779&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-197779
]
ASF GitHub Bot logged work on BEAM-4775:
----------------------------------------
Author: ASF GitHub Bot
Created on: 12/Feb/19 20:14
Start Date: 12/Feb/19 20:14
Worklog Time Spent: 10m
Work Description: ryan-williams commented on pull request #7823: [DO NOT
MERGE] [BEAM-4775] Second take on portable metrics over the job-server API
URL: https://github.com/apache/beam/pull/7823
Supersedes #7641, incorporating designs and feedback from the
https://s.apache.org/get-metrics-api doc. Submitting now for Jenkins-testing.
## Summary
- implements counter, distribution, and gauge metrics over the Job API
  - "int" versions only; removed "double" versions
  - works in Java; I think some metric types may still be TODO in Python
- main RPC looks similar to that in the design doc, but adds an extra
  `MetricResults` proto that wraps "attempted" and "committed" lists of
  MonitoringInfos
- bypasses a lot of proto-additions and moves from #7641 that caused some
  `import`-thrashing
  - moving metrics-related protos to their own file/package may still make
    sense
  - this PR is a net remover of metric protos, so on balance I decided
    saving such a move for later is OK
- some additions and changes to MonitoringInfo and SimpleMonitoringInfoBuilder
  - e.g.: expands `type_urn` to an array of `type_urns` that are all
    acceptable
    - for example, user metrics should allow multiple types (the URN should
      also be changed to not specify counters explicitly)
- takes steps toward making `MetricKey` opaque in Java
  - polymorphism over `MetricName` does most of the work of bridging between
    existing Java metrics-types and MonitoringInfo-based ones
  - `MonitoringInfoMetricName` is used for "system metrics", and essentially
    mirrors a `MonitoringInfo`
    - accesses to `getName` and `getNamespace` here are discouraged, and
      will now `throw` if they are called on a system-metric
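To make the response shape in the summary concrete, here is a minimal Python sketch of a `MetricResults` wrapper holding "attempted" and "committed" lists of MonitoringInfos, plus a check against a list of acceptable `type_urns`. It is illustrative only: the field names, the `example:` URN strings, `ACCEPTED_TYPE_URNS`, and the helpers `is_acceptable` and `sum_by_urn` are all hypothetical and do not reproduce the proto definitions in this PR.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MonitoringInfo:
    """Hypothetical stand-in for the MonitoringInfo proto (int payloads only)."""
    urn: str
    type_urn: str
    labels: Dict[str, str]
    value: int


@dataclass
class MetricResults:
    """Wraps the attempted and committed views of a job's metrics."""
    attempted: List[MonitoringInfo] = field(default_factory=list)
    committed: List[MonitoringInfo] = field(default_factory=list)


# Hypothetical spec: each metric URN lists every payload type it accepts,
# mirroring the move from a single `type_urn` to an array of `type_urns`.
ACCEPTED_TYPE_URNS = {
    "example:metric:user": [
        "example:type:sum_int64",
        "example:type:distribution_int64",
        "example:type:latest_int64",
    ],
}


def is_acceptable(info: MonitoringInfo) -> bool:
    """True if the payload type is one of those allowed for the URN."""
    return info.type_urn in ACCEPTED_TYPE_URNS.get(info.urn, [])


def sum_by_urn(infos: List[MonitoringInfo]) -> Dict[str, int]:
    """Totals the integer payloads per URN (illustration only)."""
    totals: Dict[str, int] = {}
    for info in infos:
        totals[info.urn] = totals.get(info.urn, 0) + info.value
    return totals


results = MetricResults(
    attempted=[
        MonitoringInfo("example:metric:user", "example:type:sum_int64", {"step": "s1"}, 5),
        MonitoringInfo("example:metric:user", "example:type:sum_int64", {"step": "s2"}, 7),
    ],
    committed=[
        MonitoringInfo("example:metric:user", "example:type:sum_int64", {"step": "s1"}, 5),
    ],
)

attempted_totals = sum_by_urn(results.attempted)
committed_totals = sum_by_urn(results.committed)
all_acceptable = all(is_acceptable(i) for i in results.attempted + results.committed)
```

Keeping the two lists side by side in one message lets a caller compare attempted and committed values from a single RPC response.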
## TODO
- I think the `MetricName`-polymorphism (that distinguishes between "system"
  and "user" metrics) should instead happen in `MetricKey`:
  - user- and system- `MetricKey`s will always be backed by `MonitoringInfo`
    constructs ({URN, labels})
  - user-metric-specific accessors ({`getNamespace`,`getName`}) will still
    be used by legacy code, but may throw if new system metrics get into those
    code-paths.
- I moved a few metrics files from `runners/core-construction-java` to
  `java/core`, but that may not be necessary; go back and undo those moves if
  they turn out not to be needed.
- Make sure all the metric types are being surfaced in Python.
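The first TODO item could look roughly like the sketch below. The real classes are Java; Python is used here only to keep the example short. It assumes a key backed purely by {URN, labels}, with the user-metric accessors raising when called on a system metric. The URN strings, the `namespace` and `name` label keys, and `NotAUserMetricError` are hypothetical names chosen for illustration.

```python
from typing import Dict

# Hypothetical URN marking user-defined metrics.
USER_METRIC_URN = "example:metric:user"


class NotAUserMetricError(Exception):
    """Raised when a user-metric accessor is called on a system metric."""


class MetricKey:
    """Hypothetical key backed only by a URN and a label map."""

    def __init__(self, urn: str, labels: Dict[str, str]) -> None:
        self.urn = urn
        self.labels = dict(labels)  # copy so callers cannot mutate the key

    def is_user_metric(self) -> bool:
        return self.urn == USER_METRIC_URN

    def _require_user_metric(self, accessor: str) -> None:
        if not self.is_user_metric():
            raise NotAUserMetricError(
                f"{accessor} is undefined for system metric {self.urn}"
            )

    def get_namespace(self) -> str:
        """Legacy accessor; only meaningful for user metrics."""
        self._require_user_metric("get_namespace")
        return self.labels["namespace"]

    def get_name(self) -> str:
        """Legacy accessor; only meaningful for user metrics."""
        self._require_user_metric("get_name")
        return self.labels["name"]


user_key = MetricKey(
    USER_METRIC_URN, {"namespace": "my.pipeline", "name": "records_read"}
)
system_key = MetricKey("example:metric:element_count", {"pcollection": "pc1"})

try:
    system_key.get_name()
    system_raised = False
except NotAUserMetricError:
    system_raised = True
```

Legacy code paths that only ever see user metrics keep working unchanged, while a system metric reaching them fails loudly instead of returning a made-up namespace or name.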
cc @robertwb @ajamato in case you want an early look. I'm going to start on
the TODOs above in the meantime.
Issue Time Tracking
-------------------
Worklog Id: (was: 197779)
Time Spent: 4.5h (was: 4h 20m)
> JobService should support returning metrics
> -------------------------------------------
>
> Key: BEAM-4775
> URL: https://issues.apache.org/jira/browse/BEAM-4775
> Project: Beam
> Issue Type: Bug
> Components: beam-model
> Reporter: Eugene Kirpichov
> Assignee: Ryan Williams
> Priority: Major
> Labels: triaged
> Time Spent: 4.5h
> Remaining Estimate: 0h
>
> [https://github.com/apache/beam/blob/master/model/job-management/src/main/proto/beam_job_api.proto]
> currently doesn't appear to have a way for JobService to return metrics to a
> user, even though
> [https://github.com/apache/beam/blob/master/model/fn-execution/src/main/proto/beam_fn_api.proto]
> includes support for reporting SDK metrics to the runner harness.
>
> Metrics are apparently necessary to run any ValidatesRunner tests because
> PAssert needs to validate that the assertions succeeded. However, this
> statement should be double-checked: perhaps it's possible to somehow work
> with PAssert without metrics support.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)