Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Andrea Foegler Fri, 13 Apr 2018 16:53:00 -0700

That's a great summary Alex, thanks!

This doesn't address all your questions, but in terms of how I see the
MetricSpec being specified / shared is something like this:
SDKs just share the same MetricSpec file which defines all the system
metrics guaranteed by Beam.  SDK-specific additions can be handled with an
addendum.
That spec can be read by the SDK and by the Runner.  The SDK is responsible
for populating the metric values according to the spec for all system
metrics.  The runner doesn't really need the spec for user-defined metrics,
since there's really nothing to do with them but forward them along.


I think this should eliminate any concerns around misspellings and such.
It would even be pretty simple to automatically generate protos for each
system MetricSpec and the code to convert from the proto to the MetricSpec.

I do think runners should treat any metrics they don't know about just like
user metrics - metrics to be forwarded to downstream monitoring tools.

I think I'm unconvinced this Metrics API should handle cases like the I/O
files case.



On Fri, Apr 13, 2018 at 4:30 PM Alex Amato <[email protected]> wrote:

> There are a few more confusing concepts in this thread
> *Name*
>
>    - Name can mean a *"string name"* used to refer to a metric in a
>    metrics system such as stackdriver, i.e. "ElementCount", "ExecutionTime"
>    - Name can mean a set of *context* fields added to a counter, either
>    embedded in a complex string, or in a structured name. Typically referring
>    to *aggregation entities, *which define how the metric updates get
>    aggregated into final metric values, i.e. all Metric updates with the same
>    field are aggregated together.
>       - e.g.my_ptransform_id-ElementCount
>       - e.g.{ name : 'ElementCount', 'ptransform_name' :
>       'my_ptransform_id' }
>    - The *URN* of a Metric, which identifies a proto to use in a payload
>    field for the Metric and MetricSpec. Note: The string name, can literally
>    be the URN value in most cases, except for metrics which can specify a
>    separate name (i.e. user counters).
>
>
>
> @Robert,
> You have proposed that metrics should contain the following parts, I still
> don't fully understand what you mean by each one.
>
>    - Name - Why is a name a URN + bytes payload? What type of name are
>    you referring to, *string name*? *context*? *URN*? Or something else.
>    - Entity - This is how the metric is aggregated together. If I
>    understand you correctly. And you correctly point out that a singular
>    entity is not sufficient, a set of labels may be more appropriate.
>    - Value - *Are you saying this is just the metric value, not including
>    any fields related to entity or name.*
>    - Type - I am not clear at all on what this is or what it would look
>    like. Are you referring to units, like milliseconds/seconds? Why it
>    wouldn't be part of the value payload. Is this some sort of reason to
>    separate it out from the value? What if the value has multiple fields for
>    example.
>
> Some pros and cons as I see them
> Pros:
>
>    - More separation and flexibility for an SDK to specify labels
>    separately from the value/type. Though, maybe I don't understand enough,
>    and I am not so sure this is a con over just having the URN payload contain
>    everything in itself.
>
> Cons:
>
>    - I think this means that the SDK must properly pick two separate
>    payloads and populate them correctly. We can run into issues where.
>       - Having one URN which specifies all the fields you would need to
>       populate for a specific metric avoids this, this was a concern brought 
> up
>       by Luke. The runner would then be responsible for packaging metrics up 
> to
>       send to external monitoring systems.
>
>
> @Andrea, please correct me if I misunderstand
> Thank you for the metric spec example in your last response, I think that
> makes the idea much more clear.
>
> Using your approach I see the following pros and cons
> Pros:
>
>    - Runners have a cleaner more reusable codepath to forwarding metrics
>    to external monitoring systems. This will mean less work on the runner side
>    to support each metric (perhaps none in many cases).
>    - SDKs may need less code as well to package up new metrics.
>    - As long\ as we expect SDKs to only send cataloged/requested metrics,
>    we can avoid the issues of SDKs creating too many metrics, metrics the
>    runner/engine don't understand, etc.
>
> Cons:
>
>    - Luke's concern with this approach was that this spec ends up boiling
>    down to just the name, in this case "my_timer". His concern is that with
>    many SDK implementations, we can have bugs using the wrong string name for
>    counters, or populating them with the wrong values.
>       - Note how the ParDoExecution time example in the doc lets you
>       build the group of metrics together, rather than reporting three 
> different
>       ones from SDK->Runner. This sort of thing can make it more clear how to
>       fill in metrics in the SDK side. Then the RunnerHarness is responsible 
> for
>       packaging the metrics up for monitoring systems, not the SDK side.
>    - Ruling out URNs+payloads altogether (Though, I don't think you are
>    suggesting this) is less extensible for custom runners+sdks+engines. I.e.
>    the table of I/O files example. It also rules out sending parameters for a
>    metric from the runner->SDK.
>    - Populating each metric spec in code in each Runner could be
>    similarly error prone. Instead of just stating "urn:namespace:my_timer",
>    you must specify this and each runner must get it correct:
>    - {
>       name: "my_timer"
>       labels: { "ptransform" }
>       type: GAUGE
>       value_type: int64
>       units: SECONDS
>       description: "Times my stuff"
>       }
>       - Would the MetricSpec be passed like that from the RunnerHarness
>       to the SDK? This part I am not so clear on.
>       - Do we want runners to accept and forward metrics they don't know
>    about? That was another concern, was to not accept them until both the SDK
>    and Runner have been updated to accept them. Consider the performance
>    implications of an SDK sending a noisy metric.
>
> This being said, I think some of this can be mitigated.
>
>    1. Could a URN describe a metric spec in the format you describe. I.e.
>    urn:namespace:my_timer is passed to the SDK, and it knows to somehow find a
>    catalog which tells it about the MetricSpec. This way with just a URN it
>    knows how to populate a common representation, like the MetricSpec you have
>    specified.
>    2. This common metric representation could be a specific URN+payload
>    used for most metrics, while extensible ones use a different URN+payload.
>
>
> Thanks for the discussion on this thread today. I'd like us to keep
> engaging like this to come to an agreement.
> Alex
>
> On Fri, Apr 13, 2018 at 2:23 PM Robert Bradshaw <[email protected]>
> wrote:
>
>> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <[email protected]> wrote:
>>
>>>
>>> Or just "beam:counter:<namespace>:<name>" or even
>>> "beam:metric:<namespace>:<name>" since metrics have a type separate from
>>> their name.
>>>
>>
>> I proposed keeping the "user" in there to avoid possible clashes with the
>> system namespaces. (No preference on counter vs. metric, I wasn't trying to
>> imply counter = SumInts)
>>
>>
>> On Fri, Apr 13, 2018 at 2:02 PM Andrea Foegler <[email protected]>
>> wrote:
>>
>>> I like the generalization from entity -> labels.  I view the purpose of
>>> those fields to provide context.  And labels feel like they supports a
>>> richer set of contexts.
>>>
>>
>> If we think such a generalization provides value, I'm fine with doing
>> that now, as sets or key-value maps, if we have good enough examples to
>> justify this.
>>
>>
>>> The URN concept gets a little tricky.  I totally agree that the context
>>> fields should not be embedded in the name.
>>> There's a "name" which is the identifier that can be used to communicate
>>> what context values are supported / allowed for metrics with that name (for
>>> example, element_count expects a ptransform ID).  But then there's the
>>> context.  In Stackdriver, this context is a map of key-value pairs; the
>>> type is considered metadata associated with the name, but not communicated
>>> with the value.
>>>
>>
>> I'm not quite following you here. If context contains a ptransform id,
>> then it cannot be associated with a single name.
>>
>>
>>> Could the URN be "beam:namespace:name" and every metric have a map of
>>> key-value pairs for context?
>>>
>>
>> The URN is the name. Something like
>> "beam:metric:ptransform_execution_times:v1."
>>
>>
>>> Not sure where this fits in the discussion or if this is handled
>>> somewhere, but allowing for a metric configuration that's provided
>>> independently of the value allows for configuring "type", "units", etc in a
>>> uniform way without having to encode them in the metric name / value.
>>> Stackdriver expects each metric type has been configured ahead of time with
>>> these annotations / metadata.  Then values are reported separately.  For
>>> system metrics, the definitions can be packaged with the SDK.  For user
>>> metrics, they'd be defined at runtime.
>>>
>>
>> This feels like the metrics spec, that specifies that the metric with
>> name/URN X has this type plus a bunch of other metadata (e.g. units, if
>> they're not implicit in the type? This gets into whether the type should be
>> Duration{Sum,Max,Distribution,...} vs. Int{Sum,Max,Distribution,...} +
>> units metadata).
>>
>>
>>>
>>>
>>> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <[email protected]> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Apr 13, 2018 at 1:27 PM Robert Bradshaw <[email protected]>
>>>> wrote:
>>>>
>>>>> On Fri, Apr 13, 2018 at 1:19 PM Kenneth Knowles <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> On Fri, Apr 13, 2018 at 1:07 PM Robert Bradshaw <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Also, the only use for payloads is because "User Counter" is
>>>>>>> currently a single URN, rather than using the namespacing 
>>>>>>> characteristics
>>>>>>> of URNs to map user names onto distinct metric names.
>>>>>>>
>>>>>>
>>>>>> Can they be URNs? I don't see value in having a "user metric" URN
>>>>>> where you then have to look elsewhere for what the real name is.
>>>>>>
>>>>>
>>>>> Yes, that was my point with the parenthetical statement. I would
>>>>> rather have "beam:counter:user:use_provide_namespace:user_provide_name"
>>>>> than use the payload field for this. So if we're going to keep the payload
>>>>> field, we need more compelling usecases.
>>>>>
>>>>
>>>> Or just "beam:counter:<namespace>:<name>" or even
>>>> "beam:metric:<namespace>:<name>" since metrics have a type separate from
>>>> their name.
>>>>
>>>> Kenn
>>>>
>>>>
>>>>> A payload avoids the messiness of having to pack (and parse) arbitrary
>>>>>> parameters into a name though.) If we're going to choose names that the
>>>>>> system and sdks agree to have specific meanings, and to avoid accidental
>>>>>> collisions, making them full-fledged documented URNs has value.
>>>>>>
>>>>>
>>>>>> Value is the "payload". Likely worth changing the name to avoid
>>>>>> confusion with the payload above. It's bytes because it depends on the
>>>>>> type. I would try to avoid nesting it too deeply (e.g. a payload within a
>>>>>> payload). If we thing the types are generally limited, another option 
>>>>>> would
>>>>>> be a oneof field (with a bytes option just in case) for transparency. 
>>>>>> There
>>>>>> are pros and cons going this route.
>>>>>>
>>>>>> Type is what I proposed we add, instead of it being implicit in the
>>>>>> name (and unknowable if one does not recognize the name). This makes 
>>>>>> things
>>>>>> more open-ended and easier to evolve and work with.
>>>>>>
>>>>>> Entity could be generalized to Label, or LabelSet if desired. But as
>>>>>> mentioned I think it makes sense to pull this out as a separate field,
>>>>>> especially when it makes sense to aggregate a single named counter across
>>>>>> labels as well as for a single label (e.g. execution time of composite
>>>>>> transforms).
>>>>>>
>>>>>> - Robert
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Apr 13, 2018 at 12:36 PM Andrea Foegler <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi folks -
>>>>>>>
>>>>>>> Before we totally go down the path of highly structured metric
>>>>>>> protos, I'd like to propose considering a simple metrics interface 
>>>>>>> between
>>>>>>> the SDK and the runner.  Something more generic and closer to what most
>>>>>>> monitoring systems would use.
>>>>>>>
>>>>>>> To use Spark as an example, the Metric system uses a simple metric
>>>>>>> format of name, value and type to report all metrics in a single 
>>>>>>> structure,
>>>>>>> regardless of the source or context of the metric.
>>>>>>>
>>>>>>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-MetricsSystem.html
>>>>>>>
>>>>>>> The subsystems have contracts for what metrics they will expose and
>>>>>>> how they are calculated:
>>>>>>>
>>>>>>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-taskmetrics-ShuffleWriteMetrics.html
>>>>>>>
>>>>>>> Codifying the system metrics in the SDK seems perfectly reasonable -
>>>>>>> no reason to make the notion of metric generic at that level.  But at 
>>>>>>> the
>>>>>>> point the metric is leaving the SDK and going to the runner, a simpler,
>>>>>>> generic encoding of the metrics might make it easier to adapt and 
>>>>>>> maintain
>>>>>>> system.  The generic format can include information about downstream
>>>>>>> consumers, if that's useful.
>>>>>>>
>>>>>>> Spark supports a number of Metric Sinks - external monitoring
>>>>>>> systems.  If runners receive a simple list of metrics, implementing any
>>>>>>> number of Sinks for Beam would be straightforward and would generally 
>>>>>>> be a
>>>>>>> one time implementation.  If instead all system metrics are sent 
>>>>>>> embedded
>>>>>>> in a highly structured, semantically meaningful structure, runner code
>>>>>>> would need to be updated to support exporting the new metric. We seem 
>>>>>>> to be
>>>>>>> heading in the direction of "if you don't understand this metric, you 
>>>>>>> can't
>>>>>>> use it / export it".  But most systems seem to assume metrics are really
>>>>>>> simple named values that can be handled a priori.
>>>>>>>
>>>>>>> So I guess my primary question is:  Is it necessary for Beam to
>>>>>>> treat metrics as highly semantic, arbitrarily complex data?  Or could 
>>>>>>> they
>>>>>>> possibly be the sort of simple named values as they are in most 
>>>>>>> monitoring
>>>>>>> systems and in Spark?  With the SDK potentially providing scaffolding to
>>>>>>> add meaning and structure, but simplifying that out before leaving SDK
>>>>>>> code.  Is the coupling to a semantically meaningful structure between 
>>>>>>> the
>>>>>>> SDK and runner and necessary complexity?
>>>>>>>
>>>>>>> Andrea
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Apr 13, 2018 at 10:20 AM Robert Bradshaw <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> On Fri, Apr 13, 2018 at 10:10 AM Alex Amato <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> *Thank you for this clarification. I think the table of files fits
>>>>>>>>> into the model as one of type string-set (with union as aggregation). 
>>>>>>>>> *
>>>>>>>>> Its not a list of files, its a list of metadata for each file,
>>>>>>>>> several pieces of data per file.
>>>>>>>>>
>>>>>>>>> Are you proposing that there would be separate URNs as well for
>>>>>>>>> each entity being measured then, so the the URN defines the type of 
>>>>>>>>> entity
>>>>>>>>> being measured.
>>>>>>>>> "urn.beam.metrics.PCollectionByteCount" is a URN for always for
>>>>>>>>> PCollection entities
>>>>>>>>> "urn.beam.metrics.PTransformExecutionTime" is a URN is always for
>>>>>>>>> PTransform entities
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yes. FWIW, it may not even be needed to put this in the name, e.g.
>>>>>>>> execution times are never for PCollections, and even if they were it'd 
>>>>>>>> be
>>>>>>>> semantically a very different beast (which should not re-use the same 
>>>>>>>> URN).
>>>>>>>>
>>>>>>>> *message MetricSpec {*
>>>>>>>>> *  // (Required) A URN that describes the accompanying payload.*
>>>>>>>>> *  // For any URN that is not recognized (by whomever is
>>>>>>>>> inspecting*
>>>>>>>>> *  // it) the parameter payload should be treated as opaque and*
>>>>>>>>> *  // passed as-is.*
>>>>>>>>> *  string urn = 1;*
>>>>>>>>>
>>>>>>>>> *  // (Optional) The data specifying any parameters to the URN. If*
>>>>>>>>> *  // the URN does not require any arguments, this may be omitted.*
>>>>>>>>> *  bytes parameters_payload = 2;*
>>>>>>>>>
>>>>>>>>> *  // (Required) A URN that describes the type of values this
>>>>>>>>> metric*
>>>>>>>>> *  // records (e.g. durations that should be summed).*
>>>>>>>>> *}*
>>>>>>>>>
>>>>>>>>> *message Metric[Values] {*
>>>>>>>>> * // (Required) The original requesting MetricSpec.*
>>>>>>>>> * MetricSpec metric_spec = 1;*
>>>>>>>>>
>>>>>>>>> * // A mapping of entities to (encoded) values.*
>>>>>>>>> * map<string, bytes> values;*
>>>>>>>>> This ignores the non-unqiueness of entity identifiers. This is why
>>>>>>>>> in my doc, I have specified the entity type and its string identifier
>>>>>>>>> @Ken, I believe you have pointed this out in the past, that
>>>>>>>>> uniqueness is only guaranteed within a type of entity (all 
>>>>>>>>> PCollections),
>>>>>>>>> but not between entities (A Pcollection and PTransform may have the 
>>>>>>>>> same
>>>>>>>>> identifier).
>>>>>>>>>
>>>>>>>>
>>>>>>>> See above for why this is not an issue. The extra complexity (in
>>>>>>>> protos and code), the inability to use them as map keys, and the fact 
>>>>>>>> that
>>>>>>>> they'll be 100% redundant for all entities for a given metric 
>>>>>>>> convinces me
>>>>>>>> that it's not worth creating and tracking an enum for the type 
>>>>>>>> alongside
>>>>>>>> the id.
>>>>>>>>
>>>>>>>>
>>>>>>>>> *}*
>>>>>>>>>
>>>>>>>>> On Fri, Apr 13, 2018 at 9:14 AM Robert Bradshaw <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> On Fri, Apr 13, 2018 at 8:31 AM Kenneth Knowles <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> To Robert's proto:
>>>>>>>>>>>
>>>>>>>>>>>  // A mapping of entities to (encoded) values.
>>>>>>>>>>>>  map<string, bytes> values;
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Are the keys here the names of the metrics, aka what is used for
>>>>>>>>>>> URNs in the doc?
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>> They're the entities to which a metric is attached, e.g. a
>>>>>>>>>> PTransform, a PCollection, or perhaps a process/worker.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles <[email protected]>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Agree with all of this. It echoes a thread on the doc that I
>>>>>>>>>>>>>> was going to bring here. Let's keep it simple and use concrete 
>>>>>>>>>>>>>> use cases to
>>>>>>>>>>>>>> drive additional abstraction if/when it becomes compelling.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sounds perfect. Just wanted to make sure that "custom
>>>>>>>>>>>>>>> metrics of supported type" didn't include new ways of 
>>>>>>>>>>>>>>> aggregating ints. As
>>>>>>>>>>>>>>> long as that means we have a fixed set of aggregations (that 
>>>>>>>>>>>>>>> align with
>>>>>>>>>>>>>>> what what users want and metrics back end support) it seems 
>>>>>>>>>>>>>>> like we are
>>>>>>>>>>>>>>> doing user metrics right.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - Ben
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <
>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Maybe leave it out until proven it is needed. ATM counters
>>>>>>>>>>>>>>>> are used a lot but others are less mainstream so being too 
>>>>>>>>>>>>>>>> fine from the
>>>>>>>>>>>>>>>> start can just add complexity and bugs in impls IMHO.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Le 12 avr. 2018 08:06, "Robert Bradshaw" <
>>>>>>>>>>>>>>>> [email protected]> a écrit :
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> By "type" of metric, I mean both the data types (including
>>>>>>>>>>>>>>>>> their encoding) and accumulator strategy. So sumint would be 
>>>>>>>>>>>>>>>>> a type, as
>>>>>>>>>>>>>>>>> would double-distribution.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <
>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> When you say type do you mean accumulator type, result
>>>>>>>>>>>>>>>>>> type, or accumulator strategy? Specifically, what is the 
>>>>>>>>>>>>>>>>>> "type" of sumint,
>>>>>>>>>>>>>>>>>> sumlong, meanlong, etc?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Fully custom metric types is the "more speculative and
>>>>>>>>>>>>>>>>>>> difficult" feature that I was proposing we kick down the 
>>>>>>>>>>>>>>>>>>> road (and may
>>>>>>>>>>>>>>>>>>> never get to). What I'm suggesting is that we support 
>>>>>>>>>>>>>>>>>>> custom metrics of
>>>>>>>>>>>>>>>>>>> standard type.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <
>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The metric api is designed to prevent user defined
>>>>>>>>>>>>>>>>>>>> metric types based on the fact they just weren't used 
>>>>>>>>>>>>>>>>>>>> enough to justify
>>>>>>>>>>>>>>>>>>>> support.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Is there a reason we are bringing that complexity back?
>>>>>>>>>>>>>>>>>>>> Shouldn't we just need the ability for the standard set 
>>>>>>>>>>>>>>>>>>>> plus any special
>>>>>>>>>>>>>>>>>>>> system metrivs?
>>>>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks. I think this has simplified things.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> One thing that has occurred to me is that we're
>>>>>>>>>>>>>>>>>>>>> conflating the idea of custom metrics and custom metric 
>>>>>>>>>>>>>>>>>>>>> types. I would
>>>>>>>>>>>>>>>>>>>>> propose the MetricSpec field be augmented with an 
>>>>>>>>>>>>>>>>>>>>> additional field "type"
>>>>>>>>>>>>>>>>>>>>> which is a urn specifying the type of metric it is (i.e. 
>>>>>>>>>>>>>>>>>>>>> the contents of
>>>>>>>>>>>>>>>>>>>>> its payload, as well as the form of aggregation). Summing 
>>>>>>>>>>>>>>>>>>>>> or maxing over
>>>>>>>>>>>>>>>>>>>>> ints would be a typical example. Though we could pursue 
>>>>>>>>>>>>>>>>>>>>> making this opaque
>>>>>>>>>>>>>>>>>>>>> to the runner in the long run, that's a more speculative 
>>>>>>>>>>>>>>>>>>>>> (and difficult)
>>>>>>>>>>>>>>>>>>>>> feature to tackle. This would allow the runner to at 
>>>>>>>>>>>>>>>>>>>>> least aggregate and
>>>>>>>>>>>>>>>>>>>>> report/return to the SDK metrics that it did not itself 
>>>>>>>>>>>>>>>>>>>>> understand the
>>>>>>>>>>>>>>>>>>>>> semantic meaning of. (It would probably simplify much of 
>>>>>>>>>>>>>>>>>>>>> the specialization
>>>>>>>>>>>>>>>>>>>>> in the runner itself for metrics that it *did* understand 
>>>>>>>>>>>>>>>>>>>>> as well.)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> In addition, rather than having UserMetricOfTypeX for
>>>>>>>>>>>>>>>>>>>>> every type X one would have a single URN for UserMetric 
>>>>>>>>>>>>>>>>>>>>> and it spec would
>>>>>>>>>>>>>>>>>>>>> designate the type and payload designate the (qualified) 
>>>>>>>>>>>>>>>>>>>>> name.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> - Robert
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <
>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thank you everyone for your feedback so far.
>>>>>>>>>>>>>>>>>>>>>> I have made a revision today which is to make all
>>>>>>>>>>>>>>>>>>>>>> metrics refer to a primary entity, so I have 
>>>>>>>>>>>>>>>>>>>>>> restructured some of the
>>>>>>>>>>>>>>>>>>>>>> protos a little bit.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> The point of this change was to futureproof the
>>>>>>>>>>>>>>>>>>>>>> possibility of allowing custom user metrics, with custom 
>>>>>>>>>>>>>>>>>>>>>> aggregation
>>>>>>>>>>>>>>>>>>>>>> functions for its metric updates.
>>>>>>>>>>>>>>>>>>>>>> Now that each metric has an aggregation_entity
>>>>>>>>>>>>>>>>>>>>>> associated with it (e.g. PCollection, PTransform), we 
>>>>>>>>>>>>>>>>>>>>>> can design an
>>>>>>>>>>>>>>>>>>>>>> approach which forwards the opaque bytes metric updates, 
>>>>>>>>>>>>>>>>>>>>>> without
>>>>>>>>>>>>>>>>>>>>>> deserializing them. These are forwarded to user provided 
>>>>>>>>>>>>>>>>>>>>>> code which then
>>>>>>>>>>>>>>>>>>>>>> would deserialize the metric update payloads and perform 
>>>>>>>>>>>>>>>>>>>>>> the custom
>>>>>>>>>>>>>>>>>>>>>> aggregations.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I think it has also simplified some of the URN metric
>>>>>>>>>>>>>>>>>>>>>> protos, as they do not need to keep track of ptransform 
>>>>>>>>>>>>>>>>>>>>>> names inside
>>>>>>>>>>>>>>>>>>>>>> themselves now. The result is simpler structures, for 
>>>>>>>>>>>>>>>>>>>>>> the metrics as the
>>>>>>>>>>>>>>>>>>>>>> entities are pulled outside of the metric.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I have mentioned this in the doc now, and wanted to
>>>>>>>>>>>>>>>>>>>>>> draw attention to this particular revision.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <
>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I've gathered a lot of feedback so far and want to
>>>>>>>>>>>>>>>>>>>>>>> make a decision by Friday, and begin working on related 
>>>>>>>>>>>>>>>>>>>>>>> PRs next week.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Please make sure that you provide your feedback
>>>>>>>>>>>>>>>>>>>>>>> before then and I will post the final decisions made to 
>>>>>>>>>>>>>>>>>>>>>>> this thread Friday
>>>>>>>>>>>>>>>>>>>>>>> afternoon.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <
>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Nice, I created a short link so people can refer to
>>>>>>>>>>>>>>>>>>>>>>>> it easily in
>>>>>>>>>>>>>>>>>>>>>>>> future discussions, website, etc.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks for sharing.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> > Thanks for the nice writeup. I added some
>>>>>>>>>>>>>>>>>>>>>>>> comments.
>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <
>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >> Hello beam community,
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >> Thank you everyone for your initial feedback on
>>>>>>>>>>>>>>>>>>>>>>>> this proposal so far. I
>>>>>>>>>>>>>>>>>>>>>>>> >> have made some revisions based on the feedback.
>>>>>>>>>>>>>>>>>>>>>>>> There were some larger
>>>>>>>>>>>>>>>>>>>>>>>> >> questions asking about alternatives. For each of
>>>>>>>>>>>>>>>>>>>>>>>> these I have added a
>>>>>>>>>>>>>>>>>>>>>>>> >> section tagged with [Alternatives] and discussed
>>>>>>>>>>>>>>>>>>>>>>>> my recommendation as well
>>>>>>>>>>>>>>>>>>>>>>>> >> as as few other choices we considered.
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >> I would appreciate more feedback on the revised
>>>>>>>>>>>>>>>>>>>>>>>> proposal. Please take
>>>>>>>>>>>>>>>>>>>>>>>> >> another look and let me know
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >> Etienne, I would appreciate it if you could
>>>>>>>>>>>>>>>>>>>>>>>> please take another look after
>>>>>>>>>>>>>>>>>>>>>>>> >> the revisions I have made as well.
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >> Thanks again,
>>>>>>>>>>>>>>>>>>>>>>>> >> Alex
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Reply via email to