Alex and I have PRs out related to supporting metrics in portable-runner
code-paths:

   - #7624 <https://github.com/apache/beam/pull/7624> associates metrics in
   the SDK harness with the (pre-fusion) PTransforms the user defined them in.
   - #7641 <https://github.com/apache/beam/pull/7641> sends metrics over
   the "Job API" (between job server and portable runner):
      - Flink portable-VR metrics tests pass (Java)
      - metrics print()s work in portable wordcount (Python)

*Open Questions:*

   - What to do with type-specific protos (e.g. IntDistributionData vs.
   DoubleDistributionData)?
      - I think Alex and I were leaning toward only supporting the
      "int"-cases for now
      - That's what Java does in its existing metrics
      <https://github.com/apache/beam/pull/7641#discussion_r251895392>
   - "MetricKey" and "MetricName" semantics:
      - These exist in Java and Python, and I added proto versions in #7641
      <https://github.com/apache/beam/pull/7641/files#r251896650>.
      - MetricName wraps "namespace" and "name" strings, and MetricKey
      wraps a "step (ptransform) name" and a MetricName.
      - PCollection-scoped metrics (e.g. element count) are identified by a
      null "step name" in #7624 <https://github.com/apache/beam/pull/7624>
      and #7641 <https://github.com/apache/beam/pull/7641>.
      - Alex and I discussed using URNs as the source of this information
      instead:
         - "step name" can instead come from a MonitoringInfo's PTRANSFORM
         label
         <https://github.com/apache/beam/blob/efb83e6c2fe486793947f6a80bec3a61f53a06bb/model/fn-execution/src/main/proto/beam_fn_api.proto#L436>,
         while "namespace" and "name" can be parsed from its URN
         <https://github.com/apache/beam/blob/efb83e6c2fe486793947f6a80bec3a61f53a06bb/model/fn-execution/src/main/proto/beam_fn_api.proto#L457-L482>.
         - URNs could encode these over the wire, then SDKs could convert
         to existing MetricKey/MetricNames for use in querying / MetricResults
         - or: we could more deeply overhaul SDKs' metrics/querying
         structures to use MonitoringInfos / URNs.
            - at the least, SDKs should get helpers for querying for Alex's
            new "system metrics" (e.g. element count, various timings
            <https://github.com/apache/beam/blob/efb83e6c2fe486793947f6a80bec3a61f53a06bb/model/fn-execution/src/main/proto/beam_fn_api.proto#L457-L482>)
            that are associated with specific URNs
   - Gauges: the protos have a nod to sending gauges over the wire as
   counters
   <https://github.com/apache/beam/blob/efb83e6c2fe486793947f6a80bec3a61f53a06bb/model/fn-execution/src/main/proto/beam_fn_api.proto#L506-L515>
      - are there problems with that?
      - #7641 should support this
      <https://github.com/apache/beam/pull/7641/files#r251930798>, for now.
   - ExtremaData: the protos contain these
   <https://github.com/apache/beam/blob/efb83e6c2fe486793947f6a80bec3a61f53a06bb/model/fn-execution/src/main/proto/beam_fn_api.proto#L517-L532>,
   but SDKs don't support them (afaik).
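
To make the "URNs as the source of this information" option concrete, here
is a rough Python sketch of the SDK-side conversion from a MonitoringInfo's
URN and labels to a MetricKey/MetricName. Everything in it is an assumption
for illustration: the URN layout ("beam:metric:user:<namespace>:<name>"),
the helper name, and the stand-in dataclasses, which are not the SDK's or
the protos' actual types. A missing PTRANSFORM label maps to a null step
name, mirroring how PCollection-scoped metrics are identified in the PRs.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Assumed URN layout for user metrics (hypothetical, for illustration only).
USER_METRIC_URN_PREFIX = "beam:metric:user:"


@dataclass(frozen=True)
class MetricName:
    """Stand-in for the SDK MetricName: wraps namespace and name strings."""
    namespace: str
    name: str


@dataclass(frozen=True)
class MetricKey:
    """Stand-in for the SDK MetricKey: a step (ptransform) name plus a MetricName."""
    step: Optional[str]  # None for metrics that carry no PTRANSFORM label
    metric: MetricName


def metric_key_from_monitoring_info(
    urn: str, labels: Mapping[str, str]
) -> MetricKey:
    """Builds a MetricKey from a MonitoringInfo-like (urn, labels) pair.

    Hypothetical helper: the step name comes from the PTRANSFORM label and
    the namespace/name are parsed out of the URN.
    """
    if not urn.startswith(USER_METRIC_URN_PREFIX):
        raise ValueError("not a user-metric URN: %r" % urn)
    rest = urn[len(USER_METRIC_URN_PREFIX):]
    # Split on the first ':' only, so a name may itself contain ':'.
    namespace, sep, name = rest.partition(":")
    if not sep or not namespace or not name:
        raise ValueError("URN is missing a namespace or name: %r" % urn)
    return MetricKey(
        step=labels.get("PTRANSFORM"),
        metric=MetricName(namespace, name),
    )
```

With this shape, existing querying / MetricResults code could keep working
against MetricKeys while URNs and labels carry the information over the wire.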

Alex likely has more to add, and we plan to make a doc about these changes,
but I wanted to post here first in case others have thoughts or we are
overlooking anything.

Thanks!
