Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-19 Thread Alex Amato
Hello,

I have rewritten most of the proposal, though I think some more research is
needed to get the metric specification right. I plan to do that research,
and would like to ask you all for help making this proposal better. In
particular, now that the default metrics format is designed to allow metrics
to pass through to monitoring collection systems such as Dropwizard and
Stackdriver, the metrics need to be complete enough to be compatible with
these systems.

I think some changes will be needed to fulfill this, but I wanted to send
out this document, which contains the general idea, and continue refining
it.

Please take a look and let me know what you think.
https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit

Major Revision: April 17, 2018

The design has been reworked to use a metric format which resembles the
Dropwizard and Stackdriver formats, allowing metrics to be passed through.

The generic bytes payload style of metrics is still available but is
reserved for complex use cases which do not fit into these typical metrics
collection systems.

Note: This document isn’t 100% complete; there are a few areas which need
to be improved. Through our discussion and more research, I want to complete
these details. Please share any thoughts that you have.

   1. The metric specification and Metric proto schemas may need revisions:
      1. The distribution format needs to be refined so that it's compatible
         with Stackdriver and Dropwizard; a second distribution format may be
         needed.
      2. Annotations need to be examined in detail, to determine whether
         there are first-class annotations which should be supported to pass
         through properly to Dropwizard and Stackdriver.
      3. Aggregation functions may need parameters. For example, Top(n) may
         need to be parameterized. How should this best be supported?
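To make the distribution question concrete, here is a hypothetical sketch (not the proposal's final schema) of the state a distribution metric could carry; the count/sum/min/max field set is an assumption, chosen because both Dropwizard histogram summaries and Stackdriver distribution points can be fed from such a summary, and because it merges associatively across bundles:

```python
from dataclasses import dataclass


@dataclass
class DistributionData:
    """Sketch of a mergeable distribution summary (field set is assumed)."""
    count: int = 0
    sum: float = 0.0
    min: float = float("inf")
    max: float = float("-inf")

    def update(self, value: float) -> None:
        # Commutative, associative update, so partial results can be
        # combined in any order by the runner.
        self.count += 1
        self.sum += value
        self.min = min(self.min, value)
        self.max = max(self.max, value)

    def merge(self, other: "DistributionData") -> "DistributionData":
        # Combine two partial summaries (e.g. from two bundles).
        return DistributionData(
            count=self.count + other.count,
            sum=self.sum + other.sum,
            min=min(self.min, other.min),
            max=max(self.max, other.max),
        )

    @property
    def mean(self) -> float:
        return self.sum / self.count if self.count else 0.0
```

A "second" distribution format, if needed, would presumably differ in which summary fields it carries (e.g. bucket counts for histograms), not in this merge discipline.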







Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-17 Thread Ben Chambers
That sounds like a very reasonable choice -- given the discussion seemed to
be focusing on the differences between these two categories, separating
them will allow the proposal (and implementation) to address each category
in the best way possible without needing to make compromises.

Looking forward to the updated proposal.

On Tue, Apr 17, 2018 at 10:53 AM Alex Amato  wrote:

> Hello,
>
> I just wanted to give an update.
>
> After some discussion, I've realized that it's best to break up the two
> concepts, with two separate ways of reporting monitoring data. These two
> categories are:
>
>1. Metrics - Counters, Gauges, Distributions. These are well defined
>concepts for monitoring information and need to integrate with existing
>metrics collection systems such as Dropwizard and Stackdriver. Most metrics
>will go through this model, which will allow runners to process new metrics
>without adding extra code to support them, forwarding them to metric
>collection systems.
>2. Monitoring State - This supports general monitoring data which may
>not fit into the standard model for Metrics. For example an I/O source may
>provide a table of filenames+metadata, for files which are old and blocking
>the system. I will propose a general approach, similar to the URN+payload
>approach used in the doc right now.
>
> One thing to keep in mind -- even though it makes sense to allow each I/O
source to define their own monitoring state, this then shifts
responsibility for collecting that information to each runner and
displaying that information to every consumer. It would be reasonable to
see if there could be a set of 10 or so that covered most of the cases that
could become the "standard" set (e.g., watermark information, performance
information, etc.).


Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-17 Thread Alex Amato
Hello,

I just wanted to give an update.

After some discussion, I've realized that it's best to break up the two
concepts, with two separate ways of reporting monitoring data. These two
categories are:

   1. Metrics - Counters, Gauges, Distributions. These are well defined
   concepts for monitoring information and need to integrate with existing
   metrics collection systems such as Dropwizard and Stackdriver. Most metrics
   will go through this model, which will allow runners to process new metrics
   without adding extra code to support them, forwarding them to metric
   collection systems.
   2. Monitoring State - This supports general monitoring data which may
   not fit into the standard model for Metrics. For example, an I/O source may
   provide a table of filenames+metadata for files which are old and blocking
   the system. I will propose a general approach, similar to the URN+payload
   approach used in the doc right now.
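A rough sketch of the split between the two categories, in Python; the field names here are illustrative assumptions, not the proposal's final protos:

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Metric:
    """Category 1: well-known shape a runner can forward to Dropwizard or
    Stackdriver without metric-specific code (field names assumed)."""
    name: str                                   # e.g. a metric URN
    type: str                                   # "counter" | "gauge" | "distribution"
    labels: Dict[str, str] = field(default_factory=dict)
    value: int = 0


@dataclass
class MonitoringState:
    """Category 2: escape hatch for data that does not fit the Metric model.
    A URN identifies the payload format; only consumers that know the URN
    can interpret the opaque bytes."""
    urn: str
    payload: bytes


# Hypothetical example of category 2: an I/O source reporting stuck files.
stuck_files = MonitoringState(
    urn="beam:monitoring_state:my_io:stuck_files:v1",  # made-up URN
    payload=b'[{"file": "gs://bucket/a.txt", "age_sec": 3600}]',
)
```

The design choice this illustrates: category 1 trades flexibility for zero per-metric runner code, while category 2 trades automatic forwarding for arbitrary structure.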

I will rewrite most of the doc and propose separating these two very
different use cases, one which optimizes for integration with existing
monitoring systems. The other which optimizes for flexibility, allowing
more complex and custom metrics formats for other debugging scenarios.

I just wanted to give a brief update on the direction of this change,
before writing it up in full detail.



Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-16 Thread Robert Bradshaw
I agree that the user/system dichotomy is false; the real question is how
counters can be scoped to avoid accidental (or even intentional)
interference. A system that entirely controls the interaction between the
"user" (from its perspective) and the underlying system can do this by
prefixing all requested "user" counters with a prefix it will not use
itself. Of course this breaks down whenever the wrapping isn't complete
(either on the production or consumption side), but may be worth doing for
some components (like the SDKs that value being able to provide this
isolation for better behavior). Actual (human) end users are likely to be
much less careful about avoiding conflicts than library authors who in turn
are generally less careful than authors of the system itself.

We could alternatively allow for specifying fully qualified URNs for
counter names in the SDK APIs, and letting "normal" user counters be in the
empty namespace rather than something like beam:metrics:{user,other,...},
perhaps with SDKs prohibiting certain conflicting prefixes (which is less
than ideal). A layer above the SDK that has similar absolute control over
its "users" would have a similar decision to make.
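The prefixing scheme described above can be sketched as follows; the prefix strings are assumptions for illustration, not settled names:

```python
# A wrapping layer (e.g. an SDK) keeps counters requested by its "users"
# out of the namespace it uses itself, by unconditionally prefixing them.
SYSTEM_PREFIX = "beam:metric:"        # assumed namespace the system reserves
USER_PREFIX = "beam:metric:user:"     # assumed prefix for requested counters


def scoped_counter_name(requested: str) -> str:
    """Prefix a user-requested counter so it cannot collide with system names."""
    return USER_PREFIX + requested


def is_system_metric(name: str) -> bool:
    # A name is "system" if it sits in the system namespace but not under
    # the user: sub-namespace the wrapper hands out.
    return name.startswith(SYSTEM_PREFIX) and not name.startswith(USER_PREFIX)
```

As noted above, this isolation only holds where the wrapper fully mediates both production and consumption of counter names.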



Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-14 Thread Kenneth Knowles
One reason I resist the user/system distinction is that Beam is a
multi-party system with at least SDK, runner, and pipeline. Often there may
be a DSL like SQL or Scio, or similarly someone may be building a platform
for their company where there is no user authoring the pipeline. Should
Scio, SQL, or MyCompanyFramework metrics end up in "user"? Who decides to
tack on the prefix? It looks like it is the SDK harness? Are there just
three namespaces "runner", "sdk", and "user"?  Most of what you'd think of
as "user" versus "system" should simply be the difference between
dynamically defined & typed metrics and fields in control plane protos. If
that layer of the namespaces is not finite and limited, who can make
a valid extension? Just some questions that I think would flesh out the
meaning of the "user" prefix.

Kenn

On Fri, Apr 13, 2018 at 5:26 PM Andrea Foegler  wrote:

>
>
> On Fri, Apr 13, 2018 at 5:00 PM Robert Bradshaw 
> wrote:
>
>> On Fri, Apr 13, 2018 at 3:28 PM Andrea Foegler 
>> wrote:
>>
>>> Thanks, Robert!
>>>
>>> I think my lack of clarity is around the MetricSpec.  Maybe what's in my
>>> head and what's being proposed are the same thing.  When I read that the
>>> MetricSpec describes the proto structure, that sounds kind of complicated to
>>> me.  But I may be misinterpreting it.  What I picture is something like a
>>> MetricSpec that looks like (note: my picture looks a lot like Stackdriver
>>> :):
>>>
>>> {
>>> name: "my_timer"
>>>
>>
>> name: "beam:metric:user:my_namespace:my_timer" (assuming we want to keep
>> requiring namespaces). Or "beam:metric:[some non-user designation]"
>>
>
> Sure. Looks good.
>
>
>>
>> labels: { "ptransform" }
>>>
>>
>> How does an SDK act on this information?
>>
>
> The SDK is obligated to submit any metric values for that spec with a
> "ptransform" -> "transformName" in the labels field.  Autogenerating code
> from the spec to avoid typos should be easy.
>
>
>>
>>
>>> type: GAUGE
>>> value_type: int64
>>>
>>
>> I was lumping type and value_type into the same field, as a urn for
>> possibly extensibility, as they're tightly coupled (e.g. quantiles,
>> distributions).
>>
>
> My inclination is that keeping this set relatively small and fixed to a
> set that can be readily exported to external monitoring systems is more
> useful than the added indirection to support extensibility.  Lumping
> together seems reasonable.
>
>
>>
>>
>>> units: SECONDS
>>> description: "Times my stuff"
>>>
>>
>> Are both of these optional metadata, in the form of a key-value field, or
>> flattened into the field itself (along with every other kind of metadata
>> you may want to attach)?
>>
>
> Optional metadata in the form of fixed fields.  Is there a use case for
> arbitrary metadata?  What would you do with it when exporting?
>
>
>>
>>
>>> }
>>>
>>> Then metrics submitted would look like:
>>> {
>>> name: "my_timer"
>>> labels: {"ptransform": "MyTransform"}
>>> int_value: 100
>>> }
>>>
>>
>> Yes, or value could be a bytes field that is encoded according to
>> [value_]type above, if we want that extensibility (e.g. if we want to
>> bundle the pardo sub-timings together, we'd need a proto for the value, but
>> that seems too specific to hard-code into the basic structure).
>>
>>
>>> The simplicity comes from the fact that there's only one proto format for
>>> the spec and for the value.  The only thing that varies are the entries in
>>> the map and the value field set.  It's pretty easy to establish contracts
>>> around this type of spec and even generate protos for use in the SDK that
>>> make the expectations explicit.

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-13 Thread Robert Bradshaw
On Fri, Apr 13, 2018 at 4:30 PM Alex Amato  wrote:

> There are a few more confusing concepts in this thread
> *Name*
>
>- Name can mean a *"string name"* used to refer to a metric in a
>metrics system such as stackdriver, i.e. "ElementCount", "ExecutionTime"
>- Name can mean a set of *context* fields added to a counter, either
>embedded in a complex string, or in a structured name. Typically referring
>to *aggregation entities, *which define how the metric updates get
>aggregated into final metric values, i.e. all Metric updates with the same
>field are aggregated together.
>   - e.g. my_ptransform_id-ElementCount
>   - e.g. { name : 'ElementCount', 'ptransform_name' :
>   'my_ptransform_id' }
>- The *URN* of a Metric, which identifies a proto to use in a payload
>field for the Metric and MetricSpec. Note: The string name can literally
>be the URN value in most cases, except for metrics which can specify a
>separate name (i.e. user counters).
>
> @Robert,
> You have proposed that metrics should contain the following parts, I still
> don't fully understand what you mean by each one.
>
>- Name - Why is a name a URN + bytes payload? What type of name are
>you referring to, *string name*? *context*? *URN*? Or something else.
>
> As you say above, the URN can literally be the string name. I see no
reason why this can't be the case for user counters as well (the user
counter name becoming part of the urn). The payload, should we decide to
keep it, is "part" of the name because it helps identify what exactly we're
counting. I.e. {urnX, payload1} would be distinct from {urnX, payload2}.
The only reason to have a payload is to avoid sticking stuff that would be
ugly to parse into the URN.
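The point that {urnX, payload1} and {urnX, payload2} are distinct metrics can be shown with a toy aggregator that keys on the (URN, payload) pair; this is an illustration of the identity rule, not proposed code:

```python
from collections import defaultdict

# Running totals, keyed by (urn, payload): same URN with different payloads
# forms different metric series.
totals = defaultdict(int)


def report(urn: str, payload: bytes, value: int) -> None:
    totals[(urn, payload)] += value


report("urnX", b"payload1", 5)
report("urnX", b"payload2", 7)
report("urnX", b"payload1", 3)   # merges with the first report only
```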

>
>- Entity - This is how the metric is aggregated together. If I
>understand you correctly. And you correctly point out that a singular
>entity is not sufficient, a set of labels may be more appropriate.
>
> Alternatively, the entity/labels specifies possible sub-partitions of the
metric identified by its name (as above).

>
>- Value - *Are you saying this is just the metric value, not including
>any fields related to entity or name.*
>
> Exactly. Like "5077." For some types it would be composite. The type also
indicates how it's encoded (e.g. as bytes, or which field of a oneof should
be populated).

>
>- Type - I am not clear at all on what this is or what it would look
>like. Are you referring to units, like milliseconds/seconds? Why wouldn't
>it be part of the value payload? Is there some reason to
>separate it out from the value? What if the value has multiple fields for
>example.
>
> Type would be "beam:metric_type:sum:ints" or
"beam:metric_type:distribution:doubles." We could separate "data type" from
"aggregation type" if desired, though of course the full cross-product
doesn't make sense. We could put the unit in the type (e.g. sum_durations
!= sum_ints), but, preferably, I'd put this as metadata on the counter
spec. It is often fully determined by the URN, but provided so one can
reason about the metric without having to interpret the URN. It also means
we don't have to have a separate URN for each user metric type. (In fact,
any metric the runner doesn't understand would be treated as a user metric,
and aggregated as such if it understands the type.)
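A minimal sketch of what "aggregating by declared type" could look like on the runner side; "beam:metric_type:sum:ints" echoes the example above, while the other type URNs here are hypothetical placeholders:

```python
from typing import List


def aggregate(type_urn: str, values: List[int]):
    """Combine metric updates using only the declared type, so a runner can
    handle metrics whose URNs it has never seen (type strings assumed)."""
    if type_urn == "beam:metric_type:sum:ints":
        return sum(values)
    if type_urn == "beam:metric_type:latest:ints":    # gauge-like: keep last
        return values[-1]
    if type_urn == "beam:metric_type:top_n:ints":     # would need a parameter n
        return sorted(values, reverse=True)[:3]       # n=3 assumed for the sketch
    raise ValueError(f"unknown metric type: {type_urn}")
```

Note how the Top(n) case surfaces the earlier open question: the aggregation function needs a parameter, which the type URN alone does not carry.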

Some pros and cons as I see them
> Pros:
>
>- More separation and flexibility for an SDK to specify labels
>separately from the value/type. Though, maybe I don't understand enough,
>and I am not so sure this is a con over just having the URN payload contain
>everything in itself.
>
> We can't interpret a URN payload unless we know the URN. Separating things
out allows us to act on metrics without interpreting the URN (both for
unknown URNs, and simplifying the logic by not having to do lookups on the
URN everywhere).


> Cons:
>
>- I think this means that the SDK must properly pick two separate
>payloads and populate them correctly. We can run into issues where.
>   - Having one URN which specifies all the fields you would need to
>   populate for a specific metric avoids this, this was a concern brought 
> up
>   by Luke. The runner would then be responsible for packaging metrics up 
> to
>   send to external monitoring systems.
>
> I'm not following you here. We'd return exactly what Andrea suggested.


>
> @Andrea, please correct me if I misunderstand
> Thank you for the metric spec example in your last response, I think that
> makes the idea much more clear.
>
> Using your approach I see the following pros and cons
> Pros:
>
>- Runners have a cleaner more reusable codepath to forwarding metrics
>to external monitoring systems. This will mean less work on the runner side
>to support each metric (perhaps none in many cases).
>- SDKs may need less code as well to package up new 

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-13 Thread Robert Bradshaw
On Fri, Apr 13, 2018 at 3:28 PM Andrea Foegler  wrote:

> Thanks, Robert!
>
> I think my lack of clarity is around the MetricSpec.  Maybe what's in my
> head and what's being proposed are the same thing.  When I read that the
> MetricSpec describes the proto structure, that sounds kind of complicated to
> me.  But I may be misinterpreting it.  What I picture is something like a
> MetricSpec that looks like (note: my picture looks a lot like Stackdriver
> :):
>
> {
> name: "my_timer"
>

name: "beam:metric:user:my_namespace:my_timer" (assuming we want to keep
requiring namespaces). Or "beam:metric:[some non-user designation]"

labels: { "ptransform" }
>

How does an SDK act on this information?


> type: GAUGE
> value_type: int64
>

I was lumping type and value_type into the same field, as a urn for
possibly extensibility, as they're tightly coupled (e.g. quantiles,
distributions).


> units: SECONDS
> description: "Times my stuff"
>

Are both of these optional metadata, in the form of a key-value field, or
flattened into the field itself (along with every other kind of metadata
you may want to attach)?


> }
>
> Then metrics submitted would look like:
> {
> name: "my_timer"
> labels: {"ptransform": "MyTransform"}
> int_value: 100
> }
>

Yes, or value could be a bytes field that is encoded according to
[value_]type above, if we want that extensibility (e.g. if we want to
bundle the pardo sub-timings together, we'd need a proto for the value, but
that seems too specific to hard-code into the basic structure).


> The simplicity comes from the fact that there's only one proto format for
> the spec and for the value.  The only thing that varies are the entries in
> the map and the value field set.  It's pretty easy to establish contracts
> around this type of spec and even generate protos for use in the SDK that
> make the expectations explicit.
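The contract idea in the paragraph above can be sketched as a small validation step; the dict shapes mirror the example spec and value earlier in this message, and the field names are assumptions rather than a defined schema:

```python
from typing import Dict, List


def validate(spec: Dict, value: Dict) -> List[str]:
    """Check that a submitted metric value honors its MetricSpec:
    matching name and every required label present (shapes assumed)."""
    errors = []
    if value.get("name") != spec.get("name"):
        errors.append("name mismatch")
    missing = set(spec.get("labels", [])) - set(value.get("labels", {}))
    for label in sorted(missing):
        errors.append(f"missing required label: {label}")
    return errors


spec = {"name": "my_timer", "labels": ["ptransform"], "type": "GAUGE"}
ok = {"name": "my_timer", "labels": {"ptransform": "MyTransform"}, "int_value": 100}
bad = {"name": "my_timer", "labels": {}, "int_value": 100}
```

Autogenerating such a checker (or typed constructors) from the spec is the "avoid typos" step mentioned earlier in the thread.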
>
>
> On Fri, Apr 13, 2018 at 2:23 PM Robert Bradshaw 
> wrote:
>
>> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles  wrote:
>>
>>>
>>> Or just "beam:counter::" or even
>>> "beam:metric::" since metrics have a type separate from
>>> their name.
>>>
>>
>> I proposed keeping the "user" in there to avoid possible clashes with the
>> system namespaces. (No preference on counter vs. metric, I wasn't trying to
>> imply counter = SumInts)
>>
>>
>> On Fri, Apr 13, 2018 at 2:02 PM Andrea Foegler 
>> wrote:
>>
>>> I like the generalization from entity -> labels.  I view the purpose of
>>> those fields as providing context.  And labels feel like they support a
>>> richer set of contexts.
>>>
>>
>> If we think such a generalization provides value, I'm fine with doing
>> that now, as sets or key-value maps, if we have good enough examples to
>> justify this.
>>
>>
>>> The URN concept gets a little tricky.  I totally agree that the context
>>> fields should not be embedded in the name.
>>> There's a "name" which is the identifier that can be used to communicate
>>> what context values are supported / allowed for metrics with that name (for
>>> example, element_count expects a ptransform ID).  But then there's the
>>> context.  In Stackdriver, this context is a map of key-value pairs; the
>>> type is considered metadata associated with the name, but not communicated
>>> with the value.
>>>
>>
>> I'm not quite following you here. If context contains a ptransform id,
>> then it cannot be associated with a single name.
>>
>>
>>> Could the URN be "beam:namespace:name" and every metric have a map of
>>> key-value pairs for context?
>>>
>>
>> The URN is the name. Something like
>> "beam:metric:ptransform_execution_times:v1."
>>
>>
>>> Not sure where this fits in the discussion or if this is handled
>>> somewhere, but allowing for a metric configuration that's provided
>>> independently of the value allows for configuring "type", "units", etc in a
>>> uniform way without having to encode them in the metric name / value.
>>> Stackdriver expects each metric type has been configured ahead of time with
>>> these annotations / metadata.  Then values are reported separately.  For
>>> system metrics, the definitions can be packaged with the SDK.  For user
>>> metrics, they'd be defined at runtime.
>>>
>>
>> This feels like the metrics spec, which specifies that the metric with
>> name/URN X has this type plus a bunch of other metadata (e.g. units, if
>> they're not implicit in the type? This gets into whether the type should be
>> Duration{Sum,Max,Distribution,...} vs. Int{Sum,Max,Distribution,...} +
>> units metadata).
>>
>


Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-13 Thread Andrea Foegler
That's a great summary Alex, thanks!

This doesn't address all your questions, but here is how I see the
MetricSpec being specified / shared:
SDKs just share the same MetricSpec file which defines all the system
metrics guaranteed by Beam.  SDK-specific additions can be handled with an
addendum.
That spec can be read by the SDK and by the Runner.  The SDK is responsible
for populating the metric values according to the spec for all system
metrics.  The runner doesn't really need the spec for user-defined metrics,
since there's really nothing to do with them but forward them along.

I think this should eliminate any concerns around misspellings and such.
It would even be pretty simple to automatically generate protos for each
system MetricSpec and the code to convert from the proto to the MetricSpec.

I do think runners should treat any metrics they don't know about just like
user metrics - metrics to be forwarded to downstream monitoring tools.
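
Andrea's routing idea can be sketched roughly like this, with hypothetical
URNs and a hypothetical dict shape (the catalog contents below are
illustrative, not the actual Beam spec file):

```python
# Hypothetical sketch: the runner handles metrics found in the shared
# spec catalog and forwards anything else (e.g. user metrics) untouched.
# The URNs and metric dict shape are illustrative, not real Beam definitions.

SYSTEM_SPECS = {
    "beam:metric:element_count:v1",
    "beam:metric:ptransform_execution_times:v1",
}

def route_metric(metric, handle_system, forward):
    """Dispatch one metric update: known system metrics vs. pass-through."""
    if metric["name"] in SYSTEM_SPECS:
        handle_system(metric)   # runner understands this metric
    else:
        forward(metric)         # unknown/user metric: forward downstream

handled, forwarded = [], []
route_metric({"name": "beam:metric:element_count:v1", "value": 10},
             handled.append, forwarded.append)
route_metric({"name": "beam:metric:user:my_ns:my_counter", "value": 1},
             handled.append, forwarded.append)
```

The point of the sketch is that the forwarding branch never needs updating
when new user metrics appear; only the shared catalog grows.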

I think I'm unconvinced this Metrics API should handle cases like the I/O
files case.



On Fri, Apr 13, 2018 at 4:30 PM Alex Amato  wrote:

> There are a few more confusing concepts in this thread
> *Name*
>
>- Name can mean a *"string name"* used to refer to a metric in a
>metrics system such as Stackdriver, i.e. "ElementCount", "ExecutionTime"
>- Name can mean a set of *context* fields added to a counter, either
>embedded in a complex string, or in a structured name. Typically referring
>to *aggregation entities, *which define how the metric updates get
>aggregated into final metric values, i.e. all Metric updates with the same
>field are aggregated together.
>   - e.g. my_ptransform_id-ElementCount
>   - e.g. { name : 'ElementCount', 'ptransform_name' : 'my_ptransform_id' }
>- The *URN* of a Metric, which identifies a proto to use in a payload
>field for the Metric and MetricSpec. Note: The string name, can literally
>be the URN value in most cases, except for metrics which can specify a
>separate name (i.e. user counters).
>
>
>
> @Robert,
> You have proposed that metrics should contain the following parts, but I
> still don't fully understand what you mean by each one.
>
>- Name - Why is a name a URN + bytes payload? What type of name are
>you referring to, *string name*? *context*? *URN*? Or something else.
>- Entity - This is how the metric is aggregated together. If I
>understand you correctly. And you correctly point out that a singular
>entity is not sufficient, a set of labels may be more appropriate.
>- Value - *Are you saying this is just the metric value, not including
>any fields related to entity or name?*
>- Type - I am not clear at all on what this is or what it would look
>like. Are you referring to units, like milliseconds/seconds? Why wouldn't
>it be part of the value payload? Is there some reason to separate it out
>from the value? What if the value has multiple fields, for example?
>
> Some pros and cons as I see them
> Pros:
>
>- More separation and flexibility for an SDK to specify labels
>separately from the value/type. Though, maybe I don't understand enough,
>and I am not so sure this is an advantage over just having the URN payload
>contain everything in itself.
>
> Cons:
>
>- I think this means that the SDK must properly pick two separate
>payloads and populate them correctly. We can run into issues where they
>get out of sync:
>   - Having one URN which specifies all the fields you would need to
>   populate for a specific metric avoids this, this was a concern brought 
> up
>   by Luke. The runner would then be responsible for packaging metrics up 
> to
>   send to external monitoring systems.
>
>
> @Andrea, please correct me if I misunderstand
> Thank you for the metric spec example in your last response, I think that
> makes the idea much more clear.
>
> Using your approach I see the following pros and cons
> Pros:
>
>- Runners have a cleaner, more reusable codepath for forwarding metrics
>to external monitoring systems. This will mean less work on the runner side
>to support each metric (perhaps none in many cases).
>- SDKs may need less code as well to package up new metrics.
>- As long as we expect SDKs to only send cataloged/requested metrics,
>we can avoid the issues of SDKs creating too many metrics, metrics the
>runner/engine don't understand, etc.
>
> Cons:
>
>- Luke's concern with this approach was that this spec ends up boiling
>down to just the name, in this case "my_timer". His concern is that with
>many SDK implementations, we can have bugs using the wrong string name for
>counters, or populating them with the wrong values.
>   - Note how the ParDoExecution time example in the doc lets you
>   build the group of metrics together, rather than reporting three 
> different
>   ones from SDK->Runner. 

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-13 Thread Alex Amato
There are a few more confusing concepts in this thread
*Name*

   - Name can mean a *"string name"* used to refer to a metric in a metrics
   system such as Stackdriver, i.e. "ElementCount", "ExecutionTime"
   - Name can mean a set of *context* fields added to a counter, either
   embedded in a complex string, or in a structured name. Typically referring
   to *aggregation entities, *which define how the metric updates get
   aggregated into final metric values, i.e. all Metric updates with the same
   field are aggregated together.
  - e.g. my_ptransform_id-ElementCount
  - e.g. { name : 'ElementCount', 'ptransform_name' : 'my_ptransform_id' }
   - The *URN* of a Metric, which identifies a proto to use in a payload
   field for the Metric and MetricSpec. Note: The string name, can literally
   be the URN value in most cases, except for metrics which can specify a
   separate name (i.e. user counters).
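
The difference between the first two senses of "name" can be sketched as
follows. The "-" separator convention in the flat form is hypothetical,
purely to illustrate why the structured form is easier to consume:

```python
# Flat name: aggregation context embedded in a single string. Consumers
# must parse it back out, which is fragile if ids can contain the separator.
flat = "my_ptransform_id-ElementCount"
ptransform_name, metric_name = flat.rsplit("-", 1)

# Structured name: context carried as explicit fields, no parsing needed.
structured = {"name": "ElementCount", "ptransform_name": "my_ptransform_id"}
```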



@Robert,
You have proposed that metrics should contain the following parts, but I
still don't fully understand what you mean by each one.

   - Name - Why is a name a URN + bytes payload? What type of name are you
   referring to, *string name*? *context*? *URN*? Or something else.
   - Entity - This is how the metric is aggregated together. If I
   understand you correctly. And you correctly point out that a singular
   entity is not sufficient, a set of labels may be more appropriate.
   - Value - *Are you saying this is just the metric value, not including
   any fields related to entity or name?*
   - Type - I am not clear at all on what this is or what it would look
   like. Are you referring to units, like milliseconds/seconds? Why wouldn't
   it be part of the value payload? Is there some reason to separate it out
   from the value? What if the value has multiple fields, for example?

Some pros and cons as I see them
Pros:

   - More separation and flexibility for an SDK to specify labels
   separately from the value/type. Though, maybe I don't understand enough,
   and I am not so sure this is an advantage over just having the URN payload
   contain everything in itself.

Cons:

   - I think this means that the SDK must properly pick two separate
   payloads and populate them correctly. We can run into issues where they
   get out of sync:
  - Having one URN which specifies all the fields you would need to
  populate for a specific metric avoids this, this was a concern brought up
  by Luke. The runner would then be responsible for packaging metrics up to
  send to external monitoring systems.


@Andrea, please correct me if I misunderstand
Thank you for the metric spec example in your last response, I think that
makes the idea much more clear.

Using your approach I see the following pros and cons
Pros:

   - Runners have a cleaner, more reusable codepath for forwarding metrics to
   external monitoring systems. This will mean less work on the runner side to
   support each metric (perhaps none in many cases).
   - SDKs may need less code as well to package up new metrics.
   - As long as we expect SDKs to only send cataloged/requested metrics,
   we can avoid the issues of SDKs creating too many metrics, metrics the
   runner/engine don't understand, etc.

Cons:

   - Luke's concern with this approach was that this spec ends up boiling
   down to just the name, in this case "my_timer". His concern is that with
   many SDK implementations, we can have bugs using the wrong string name for
   counters, or populating them with the wrong values.
  - Note how the ParDoExecution time example in the doc lets you build
  the group of metrics together, rather than reporting three different ones
  from SDK->Runner. This sort of thing can make it more clear how to fill
  in metrics on the SDK side. Then the RunnerHarness is responsible for
  packaging the metrics up for monitoring systems, not the SDK side.
   - Ruling out URNs+payloads altogether (though I don't think you are
   suggesting this) is less extensible for custom runners+sdks+engines. I.e.
   the table of I/O files example. It also rules out sending parameters for a
   metric from the runner->SDK.
   - Populating each metric spec in code in each Runner could be similarly
   error prone. Instead of just stating "urn:namespace:my_timer", you must
   specify this and each runner must get it correct:
   - {
  name: "my_timer"
  labels: { "ptransform" }
  type: GAUGE
  value_type: int64
  units: SECONDS
  description: "Times my stuff"
  }
  - Would the MetricSpec be passed like that from the RunnerHarness to
  the SDK? This part I am not so clear on.
  - Do we want runners to accept and forward metrics they don't know
   about? Another concern was to not accept them until both the SDK
   and Runner have been updated to accept them. Consider the performance
   implications of an SDK sending a noisy metric.

This being said, I think some of this can be mitigated.

   1. Could a 

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-13 Thread Andrea Foegler
Thanks, Robert!

I think my lack of clarity is around the MetricSpec.  Maybe what's in my
head and what's being proposed are the same thing.  When I read that the
MetricSpec describes the proto structure, that sounds kind of complicated to
me.  But I may be misinterpreting it.  What I picture is something like a
MetricSpec that looks like (note: my picture looks a lot like Stackdriver
:):

{
name: "my_timer"
labels: { "ptransform" }
type: GAUGE
value_type: int64
units: SECONDS
description: "Times my stuff"
}

Then metrics submitted would look like:
{
name: "my_timer"
labels: {"ptransform": "MyTransform"}
int_value: 100
}

The simplicity comes from the fact that there's only one proto format for
the spec and for the value.  The only thing that varies are the entries in
the map and the value field set.  It's pretty easy to establish contracts
around this type of spec and even generate protos for use in the SDK that
make the expectations explicit.
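
One way to picture that contract, sketched with hypothetical Python types
rather than actual Beam protos (the class names and fields mirror the
example above but are assumptions, not a real API):

```python
# Hypothetical sketch of the single spec + value format described above.
from dataclasses import dataclass

@dataclass(frozen=True)
class MetricSpec:
    name: str
    labels: frozenset       # label keys a submitted value must carry
    type: str               # e.g. "GAUGE"
    value_type: str         # e.g. "int64"
    units: str = ""
    description: str = ""

@dataclass
class Metric:
    name: str
    labels: dict
    int_value: int

def validate(metric, specs):
    """Check a submitted metric against its registered spec."""
    spec = specs.get(metric.name)
    if spec is None:
        raise ValueError(f"unknown metric: {metric.name}")
    missing = spec.labels - metric.labels.keys()
    if missing:
        raise ValueError(f"missing labels: {sorted(missing)}")

specs = {
    "my_timer": MetricSpec(
        name="my_timer",
        labels=frozenset({"ptransform"}),
        type="GAUGE",
        value_type="int64",
        units="SECONDS",
        description="Times my stuff",
    )
}

# A conforming submission passes; one missing a label would raise.
validate(Metric("my_timer", {"ptransform": "MyTransform"}, 100), specs)
```

Because the spec and value share one shape, the same `validate` works for
every metric; only the catalog entries differ.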


On Fri, Apr 13, 2018 at 2:23 PM Robert Bradshaw  wrote:

> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles  wrote:
>
>>
>> Or just "beam:counter::" or even
>> "beam:metric::" since metrics have a type separate from
>> their name.
>>
>
> I proposed keeping the "user" in there to avoid possible clashes with the
> system namespaces. (No preference on counter vs. metric, I wasn't trying to
> imply counter = SumInts)
>
>
> On Fri, Apr 13, 2018 at 2:02 PM Andrea Foegler  wrote:
>
>> I like the generalization from entity -> labels.  I view the purpose of
>> those fields to provide context.  And labels feel like they support a
>> richer set of contexts.
>>
>
> If we think such a generalization provides value, I'm fine with doing that
> now, as sets or key-value maps, if we have good enough examples to justify
> this.
>
>
>> The URN concept gets a little tricky.  I totally agree that the context
>> fields should not be embedded in the name.
>> There's a "name" which is the identifier that can be used to communicate
>> what context values are supported / allowed for metrics with that name (for
>> example, element_count expects a ptransform ID).  But then there's the
>> context.  In Stackdriver, this context is a map of key-value pairs; the
>> type is considered metadata associated with the name, but not communicated
>> with the value.
>>
>
> I'm not quite following you here. If context contains a ptransform id,
> then it cannot be associated with a single name.
>
>
>> Could the URN be "beam:namespace:name" and every metric have a map of
>> key-value pairs for context?
>>
>
> The URN is the name. Something like
> "beam:metric:ptransform_execution_times:v1."
>
>
>> Not sure where this fits in the discussion or if this is handled
>> somewhere, but allowing for a metric configuration that's provided
>> independently of the value allows for configuring "type", "units", etc in a
>> uniform way without having to encode them in the metric name / value.
>> Stackdriver expects each metric type has been configured ahead of time with
>> these annotations / metadata.  Then values are reported separately.  For
>> system metrics, the definitions can be packaged with the SDK.  For user
>> metrics, they'd be defined at runtime.
>>
>
> This feels like the metrics spec, which specifies that the metric with
> name/URN X has this type plus a bunch of other metadata (e.g. units, if
> they're not implicit in the type? This gets into whether the type should be
> Duration{Sum,Max,Distribution,...} vs. Int{Sum,Max,Distribution,...} +
> units metadata).
>
>
>>
>>
>> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles  wrote:
>>
>>>
>>>
>>> On Fri, Apr 13, 2018 at 1:27 PM Robert Bradshaw 
>>> wrote:
>>>
 On Fri, Apr 13, 2018 at 1:19 PM Kenneth Knowles  wrote:

> On Fri, Apr 13, 2018 at 1:07 PM Robert Bradshaw 
> wrote:
>
>> Also, the only use for payloads is because "User Counter" is
>> currently a single URN, rather than using the namespacing characteristics
>> of URNs to map user names onto distinct metric names.
>>
>
> Can they be URNs? I don't see value in having a "user metric" URN
> where you then have to look elsewhere for what the real name is.
>

 Yes, that was my point with the parenthetical statement. I would rather
 have "beam:counter:user:use_provide_namespace:user_provide_name" than use
 the payload field for this. So if we're going to keep the payload field, we
 need more compelling usecases.

>>>
>>> Or just "beam:counter::" or even
>>> "beam:metric::" since metrics have a type separate from
>>> their name.
>>>
>>> Kenn
>>>
>>>
 A payload avoids the messiness of having to pack (and parse) arbitrary
> parameters into a name though.) If we're going to choose names that the
> system and sdks agree to have specific meanings, and to avoid accidental
> collisions, making 

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-13 Thread Robert Bradshaw
On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles  wrote:

>
> Or just "beam:counter::" or even
> "beam:metric::" since metrics have a type separate from
> their name.
>

I proposed keeping the "user" in there to avoid possible clashes with the
system namespaces. (No preference on counter vs. metric, I wasn't trying to
imply counter = SumInts)


On Fri, Apr 13, 2018 at 2:02 PM Andrea Foegler  wrote:

> I like the generalization from entity -> labels.  I view the purpose of
> those fields to provide context.  And labels feel like they support a
> richer set of contexts.
>

If we think such a generalization provides value, I'm fine with doing that
now, as sets or key-value maps, if we have good enough examples to justify
this.


> The URN concept gets a little tricky.  I totally agree that the context
> fields should not be embedded in the name.
> There's a "name" which is the identifier that can be used to communicate
> what context values are supported / allowed for metrics with that name (for
> example, element_count expects a ptransform ID).  But then there's the
> context.  In Stackdriver, this context is a map of key-value pairs; the
> type is considered metadata associated with the name, but not communicated
> with the value.
>

I'm not quite following you here. If context contains a ptransform id, then
it cannot be associated with a single name.


> Could the URN be "beam:namespace:name" and every metric have a map of
> key-value pairs for context?
>

The URN is the name. Something like
"beam:metric:ptransform_execution_times:v1."


> Not sure where this fits in the discussion or if this is handled
> somewhere, but allowing for a metric configuration that's provided
> independently of the value allows for configuring "type", "units", etc in a
> uniform way without having to encode them in the metric name / value.
> Stackdriver expects each metric type has been configured ahead of time with
> these annotations / metadata.  Then values are reported separately.  For
> system metrics, the definitions can be packaged with the SDK.  For user
> metrics, they'd be defined at runtime.
>

This feels like the metrics spec, which specifies that the metric with
name/URN X has this type plus a bunch of other metadata (e.g. units, if
they're not implicit in the type? This gets into whether the type should be
Duration{Sum,Max,Distribution,...} vs. Int{Sum,Max,Distribution,...} +
units metadata).
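
That last question can be sketched as two encodings of the same spec. The
normalization below is hypothetical (including the assumed milliseconds
default), purely to show the two forms carry equivalent information:

```python
# Two hypothetical encodings of a timer metric's type, per the question
# above: units folded into the type vs. a generic type plus a units field.
typed_units = {"name": "my_timer", "type": "DurationDistribution"}
generic = {"name": "my_timer", "type": "IntDistribution",
           "units": "MILLISECONDS"}

def normalize(spec):
    """Reduce either encoding to (base_type, units). Assumes Duration*
    types imply milliseconds; that default is illustrative only."""
    t = spec["type"]
    if t.startswith("Duration"):
        return ("Int" + t[len("Duration"):], "MILLISECONDS")
    return (t, spec.get("units"))
```

The trade-off is where the units knowledge lives: in the type vocabulary
itself, or in metadata that every consumer must read alongside the type.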


>
>
> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles  wrote:
>
>>
>>
>> On Fri, Apr 13, 2018 at 1:27 PM Robert Bradshaw 
>> wrote:
>>
>>> On Fri, Apr 13, 2018 at 1:19 PM Kenneth Knowles  wrote:
>>>
 On Fri, Apr 13, 2018 at 1:07 PM Robert Bradshaw 
 wrote:

> Also, the only use for payloads is because "User Counter" is currently
> a single URN, rather than using the namespacing characteristics of URNs to
> map user names onto distinct metric names.
>

 Can they be URNs? I don't see value in having a "user metric" URN where
 you then have to look elsewhere for what the real name is.

>>>
>>> Yes, that was my point with the parenthetical statement. I would rather
>>> have "beam:counter:user:use_provide_namespace:user_provide_name" than use
>>> the payload field for this. So if we're going to keep the payload field, we
>>> need more compelling usecases.
>>>
>>
>> Or just "beam:counter::" or even
>> "beam:metric::" since metrics have a type separate from
>> their name.
>>
>> Kenn
>>
>>
>>> A payload avoids the messiness of having to pack (and parse) arbitrary
 parameters into a name though.) If we're going to choose names that the
 system and sdks agree to have specific meanings, and to avoid accidental
 collisions, making them full-fledged documented URNs has value.

>>>
 Value is the "payload". Likely worth changing the name to avoid
 confusion with the payload above. It's bytes because it depends on the
 type. I would try to avoid nesting it too deeply (e.g. a payload within a
 payload). If we think the types are generally limited, another option would
 be a oneof field (with a bytes option just in case) for transparency. There
 are pros and cons going this route.

 Type is what I proposed we add, instead of it being implicit in the
 name (and unknowable if one does not recognize the name). This makes things
 more open-ended and easier to evolve and work with.

 Entity could be generalized to Label, or LabelSet if desired. But as
 mentioned I think it makes sense to pull this out as a separate field,
 especially when it makes sense to aggregate a single named counter across
 labels as well as for a single label (e.g. execution time of composite
 transforms).

 - Robert



 On Fri, Apr 13, 2018 at 12:36 PM Andrea Foegler 
 wrote:

> Hi 

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-13 Thread Andrea Foegler
I like the generalization from entity -> labels.  I view the purpose of
those fields to provide context.  And labels feel like they support a
richer set of contexts.

The URN concept gets a little tricky.  I totally agree that the context
fields should not be embedded in the name.
There's a "name" which is the identifier that can be used to communicate
what context values are supported / allowed for metrics with that name (for
example, element_count expects a ptransform ID).  But then there's the
context.  In Stackdriver, this context is a map of key-value pairs; the
type is considered metadata associated with the name, but not communicated
with the value.  Could the URN be "beam:namespace:name" and every metric
have a map of key-value pairs for context?

Not sure where this fits in the discussion or if this is handled somewhere,
but allowing for a metric configuration that's provided independently of
the value allows for configuring "type", "units", etc in a uniform way
without having to encode them in the metric name / value.  Stackdriver
expects each metric type has been configured ahead of time with these
annotations / metadata.  Then values are reported separately.  For system
metrics, the definitions can be packaged with the SDK.  For user metrics,
they'd be defined at runtime.



On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles  wrote:

>
>
> On Fri, Apr 13, 2018 at 1:27 PM Robert Bradshaw 
> wrote:
>
>> On Fri, Apr 13, 2018 at 1:19 PM Kenneth Knowles  wrote:
>>
>>> On Fri, Apr 13, 2018 at 1:07 PM Robert Bradshaw 
>>> wrote:
>>>
 Also, the only use for payloads is because "User Counter" is currently
 a single URN, rather than using the namespacing characteristics of URNs to
 map user names onto distinct metric names.

>>>
>>> Can they be URNs? I don't see value in having a "user metric" URN where
>>> you then have to look elsewhere for what the real name is.
>>>
>>
>> Yes, that was my point with the parenthetical statement. I would rather
>> have "beam:counter:user:use_provide_namespace:user_provide_name" than use
>> the payload field for this. So if we're going to keep the payload field, we
>> need more compelling usecases.
>>
>
> Or just "beam:counter::" or even
> "beam:metric::" since metrics have a type separate from
> their name.
>
> Kenn
>
>
>> A payload avoids the messiness of having to pack (and parse) arbitrary
>>> parameters into a name though.) If we're going to choose names that the
>>> system and sdks agree to have specific meanings, and to avoid accidental
>>> collisions, making them full-fledged documented URNs has value.
>>>
>>
>>> Value is the "payload". Likely worth changing the name to avoid
>>> confusion with the payload above. It's bytes because it depends on the
>>> type. I would try to avoid nesting it too deeply (e.g. a payload within a
>>> payload). If we think the types are generally limited, another option would
>>> be a oneof field (with a bytes option just in case) for transparency. There
>>> are pros and cons going this route.
>>>
>>> Type is what I proposed we add, instead of it being implicit in the name
>>> (and unknowable if one does not recognize the name). This makes things more
>>> open-ended and easier to evolve and work with.
>>>
>>> Entity could be generalized to Label, or LabelSet if desired. But as
>>> mentioned I think it makes sense to pull this out as a separate field,
>>> especially when it makes sense to aggregate a single named counter across
>>> labels as well as for a single label (e.g. execution time of composite
>>> transforms).
>>>
>>> - Robert
>>>
>>>
>>>
>>> On Fri, Apr 13, 2018 at 12:36 PM Andrea Foegler 
>>> wrote:
>>>
 Hi folks -

 Before we totally go down the path of highly structured metric protos,
 I'd like to propose considering a simple metrics interface between the SDK
 and the runner.  Something more generic and closer to what most monitoring
 systems would use.

 To use Spark as an example, the Metric system uses a simple metric
 format of name, value and type to report all metrics in a single structure,
 regardless of the source or context of the metric.

 https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-MetricsSystem.html

 The subsystems have contracts for what metrics they will expose and how
 they are calculated:

 https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-taskmetrics-ShuffleWriteMetrics.html

 Codifying the system metrics in the SDK seems perfectly reasonable - no
 reason to make the notion of metric generic at that level.  But at the
 point the metric is leaving the SDK and going to the runner, a simpler,
 generic encoding of the metrics might make it easier to adapt and maintain
 the system.  The generic format can include information about downstream
 consumers, if that's 

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-13 Thread Kenneth Knowles
On Fri, Apr 13, 2018 at 1:27 PM Robert Bradshaw  wrote:

> On Fri, Apr 13, 2018 at 1:19 PM Kenneth Knowles  wrote:
>
>> On Fri, Apr 13, 2018 at 1:07 PM Robert Bradshaw 
>> wrote:
>>
>>> Also, the only use for payloads is because "User Counter" is currently a
>>> single URN, rather than using the namespacing characteristics of URNs to
>>> map user names onto distinct metric names.
>>>
>>
>> Can they be URNs? I don't see value in having a "user metric" URN where
>> you then have to look elsewhere for what the real name is.
>>
>
> Yes, that was my point with the parenthetical statement. I would rather
> have "beam:counter:user:use_provide_namespace:user_provide_name" than use
> the payload field for this. So if we're going to keep the payload field, we
> need more compelling usecases.
>

Or just "beam:counter::" or even
"beam:metric::" since metrics have a type separate from
their name.

Kenn


> A payload avoids the messiness of having to pack (and parse) arbitrary
>> parameters into a name though.) If we're going to choose names that the
>> system and sdks agree to have specific meanings, and to avoid accidental
>> collisions, making them full-fledged documented URNs has value.
>>
>
>> Value is the "payload". Likely worth changing the name to avoid confusion
>> with the payload above. It's bytes because it depends on the type. I would
>> try to avoid nesting it too deeply (e.g. a payload within a payload). If we
>> think the types are generally limited, another option would be a oneof
>> field (with a bytes option just in case) for transparency. There are pros
>> and cons going this route.
>>
>> Type is what I proposed we add, instead of it being implicit in the name
>> (and unknowable if one does not recognize the name). This makes things more
>> open-ended and easier to evolve and work with.
>>
>> Entity could be generalized to Label, or LabelSet if desired. But as
>> mentioned I think it makes sense to pull this out as a separate field,
>> especially when it makes sense to aggregate a single named counter across
>> labels as well as for a single label (e.g. execution time of composite
>> transforms).
>>
>> - Robert
>>
>>
>>
>> On Fri, Apr 13, 2018 at 12:36 PM Andrea Foegler 
>> wrote:
>>
>>> Hi folks -
>>>
>>> Before we totally go down the path of highly structured metric protos,
>>> I'd like to propose considering a simple metrics interface between the SDK
>>> and the runner.  Something more generic and closer to what most monitoring
>>> systems would use.
>>>
>>> To use Spark as an example, the Metric system uses a simple metric
>>> format of name, value and type to report all metrics in a single structure,
>>> regardless of the source or context of the metric.
>>>
>>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-MetricsSystem.html
>>>
>>> The subsystems have contracts for what metrics they will expose and how
>>> they are calculated:
>>>
>>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-taskmetrics-ShuffleWriteMetrics.html
>>>
>>> Codifying the system metrics in the SDK seems perfectly reasonable - no
>>> reason to make the notion of metric generic at that level.  But at the
>>> point the metric is leaving the SDK and going to the runner, a simpler,
>>> generic encoding of the metrics might make it easier to adapt and maintain
>>> the system.  The generic format can include information about downstream
>>> consumers, if that's useful.
>>>
>>> Spark supports a number of Metric Sinks - external monitoring systems.
>>> If runners receive a simple list of metrics, implementing any number of
>>> Sinks for Beam would be straightforward and would generally be a one time
>>> implementation.  If instead all system metrics are sent embedded in a
>>> highly structured, semantically meaningful structure, runner code would
>>> need to be updated to support exporting the new metric. We seem to be
>>> heading in the direction of "if you don't understand this metric, you can't
>>> use it / export it".  But most systems seem to assume metrics are really
>>> simple named values that can be handled a priori.
>>>
>>> So I guess my primary question is:  Is it necessary for Beam to treat
>>> metrics as highly semantic, arbitrarily complex data?  Or could they
>>> possibly be the sort of simple named values as they are in most monitoring
>>> systems and in Spark?  With the SDK potentially providing scaffolding to
>>> add meaning and structure, but simplifying that out before leaving SDK
>>> code.  Is the coupling to a semantically meaningful structure between the
>>> SDK and runner a necessary complexity?
>>>
>>> Andrea
>>>
>>>
>>>
>>> On Fri, Apr 13, 2018 at 10:20 AM Robert Bradshaw 
>>> wrote:
>>>
 On Fri, Apr 13, 2018 at 10:10 AM Alex Amato  wrote:

>
> *Thank you for this clarification. I think the table of files fits
> into the 

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-13 Thread Robert Bradshaw
On Fri, Apr 13, 2018 at 1:19 PM Kenneth Knowles  wrote:

> On Fri, Apr 13, 2018 at 1:07 PM Robert Bradshaw 
> wrote:
>
>> Also, the only use for payloads is because "User Counter" is currently a
>> single URN, rather than using the namespacing characteristics of URNs to
>> map user names onto distinct metric names.
>>
>
> Can they be URNs? I don't see value in having a "user metric" URN where
> you then have to look elsewhere for what the real name is.
>

Yes, that was my point with the parenthetical statement. I would rather
have "beam:counter:user:user_provided_namespace:user_provided_name" than use
the payload field for this. So if we're going to keep the payload field, we
need more compelling use cases.
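The URN-namespacing idea above can be sketched as follows. This is an illustrative example only: the URN scheme and helper names are invented here, not the format Beam finally adopted.

```python
# Pack a user counter's namespace and name directly into the metric URN,
# instead of carrying them in a separate payload field.

USER_COUNTER_PREFIX = "beam:counter:user:"

def user_counter_urn(namespace: str, name: str) -> str:
    """Pack a user-provided namespace and name into a single URN."""
    return f"{USER_COUNTER_PREFIX}{namespace}:{name}"

def parse_user_counter_urn(urn: str):
    """Recover (namespace, name) from a user-counter URN, or None if it is
    some other metric URN the consumer should treat as opaque."""
    if not urn.startswith(USER_COUNTER_PREFIX):
        return None
    namespace, _, name = urn[len(USER_COUNTER_PREFIX):].partition(":")
    return namespace, name

urn = user_counter_urn("my.company.module", "elements_filtered")
assert parse_user_counter_urn(urn) == ("my.company.module", "elements_filtered")
```

Note that a user-chosen name containing ":" would break the parse step; that is exactly the "messiness of having to pack (and parse) arbitrary parameters into a name" that a payload field avoids.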


> A payload avoids the messiness of having to pack (and parse) arbitrary
> parameters into a name though.) If we're going to choose names that the
> system and sdks agree to have specific meanings, and to avoid accidental
> collisions, making them full-fledged documented URNs has value.
>

> Value is the "payload". Likely worth changing the name to avoid confusion
> with the payload above. It's bytes because it depends on the type. I would
> try to avoid nesting it too deeply (e.g. a payload within a payload). If we
> think the types are generally limited, another option would be a oneof
> field (with a bytes option just in case) for transparency. There are pros
> and cons going this route.
>
> Type is what I proposed we add, instead of it being implicit in the name
> (and unknowable if one does not recognize the name). This makes things more
> open-ended and easier to evolve and work with.
>
> Entity could be generalized to Label, or LabelSet if desired. But as
> mentioned I think it makes sense to pull this out as a separate field,
> especially when it makes sense to aggregate a single named counter across
> labels as well as for a single label (e.g. execution time of composite
> transforms).
>
> - Robert
>
>
>
> On Fri, Apr 13, 2018 at 12:36 PM Andrea Foegler 
> wrote:
>
>> Hi folks -
>>
>> Before we totally go down the path of highly structured metric protos,
>> I'd like to propose considering a simple metrics interface between the SDK
>> and the runner.  Something more generic and closer to what most monitoring
>> systems would use.
>>
>> To use Spark as an example, the Metric system uses a simple metric format
>> of name, value and type to report all metrics in a single structure,
>> regardless of the source or context of the metric.
>>
>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-MetricsSystem.html
>>
>> The subsystems have contracts for what metrics they will expose and how
>> they are calculated:
>>
>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-taskmetrics-ShuffleWriteMetrics.html
>>
>> Codifying the system metrics in the SDK seems perfectly reasonable - no
>> reason to make the notion of metric generic at that level.  But at the
>> point the metric is leaving the SDK and going to the runner, a simpler,
>> generic encoding of the metrics might make it easier to adapt and maintain
>> the system.  The generic format can include information about downstream
>> consumers, if that's useful.
>>
>> Spark supports a number of Metric Sinks - external monitoring systems.
>> If runners receive a simple list of metrics, implementing any number of
>> Sinks for Beam would be straightforward and would generally be a one time
>> implementation.  If instead all system metrics are sent embedded in a
>> highly structured, semantically meaningful structure, runner code would
>> need to be updated to support exporting the new metric. We seem to be
>> heading in the direction of "if you don't understand this metric, you can't
>> use it / export it".  But most systems seem to assume metrics are really
>> simple named values that can be handled a priori.
>>
>> So I guess my primary question is:  Is it necessary for Beam to treat
>> metrics as highly semantic, arbitrarily complex data?  Or could they
>> possibly be the sort of simple named values as they are in most monitoring
>> systems and in Spark?  With the SDK potentially providing scaffolding to
>> add meaning and structure, but simplifying that out before leaving SDK
>> code.  Is the coupling to a semantically meaningful structure between the
>> SDK and runner necessary complexity?
>>
>> Andrea
>>
>>
>>
>> On Fri, Apr 13, 2018 at 10:20 AM Robert Bradshaw 
>> wrote:
>>
>>> On Fri, Apr 13, 2018 at 10:10 AM Alex Amato  wrote:
>>>

 *Thank you for this clarification. I think the table of files fits into
 the model as one of type string-set (with union as aggregation). *
 It's not a list of files; it's a list of metadata for each file, several
 pieces of data per file.

 Are you proposing that there would be separate URNs as well for each
 entity being measured then, so that the URN 

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-13 Thread Kenneth Knowles
On Fri, Apr 13, 2018 at 1:07 PM Robert Bradshaw  wrote:

> Also, the only use for payloads is because "User Counter" is currently a
> single URN, rather than using the namespacing characteristics of URNs to
> map user names onto distinct metric names.
>

Can they be URNs? I don't see value in having a "user metric" URN where you
then have to look elsewhere for what the real name is.

Kenn


A payload avoids the messiness of having to pack (and parse) arbitrary
> parameters into a name though.) If we're going to choose names that the
> system and sdks agree to have specific meanings, and to avoid accidental
> collisions, making them full-fledged documented URNs has value.
>
> Value is the "payload". Likely worth changing the name to avoid confusion
> with the payload above. It's bytes because it depends on the type. I would
> try to avoid nesting it too deeply (e.g. a payload within a payload). If we
> think the types are generally limited, another option would be a oneof
> field (with a bytes option just in case) for transparency. There are pros
> and cons going this route.
>
> Type is what I proposed we add, instead of it being implicit in the name
> (and unknowable if one does not recognize the name). This makes things more
> open-ended and easier to evolve and work with.
>
> Entity could be generalized to Label, or LabelSet if desired. But as
> mentioned I think it makes sense to pull this out as a separate field,
> especially when it makes sense to aggregate a single named counter across
> labels as well as for a single label (e.g. execution time of composite
> transforms).
>
> - Robert
>
>
>
> On Fri, Apr 13, 2018 at 12:36 PM Andrea Foegler 
> wrote:
>
>> Hi folks -
>>
>> Before we totally go down the path of highly structured metric protos,
>> I'd like to propose considering a simple metrics interface between the SDK
>> and the runner.  Something more generic and closer to what most monitoring
>> systems would use.
>>
>> To use Spark as an example, the Metric system uses a simple metric format
>> of name, value and type to report all metrics in a single structure,
>> regardless of the source or context of the metric.
>>
>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-MetricsSystem.html
>>
>> The subsystems have contracts for what metrics they will expose and how
>> they are calculated:
>>
>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-taskmetrics-ShuffleWriteMetrics.html
>>
>> Codifying the system metrics in the SDK seems perfectly reasonable - no
>> reason to make the notion of metric generic at that level.  But at the
>> point the metric is leaving the SDK and going to the runner, a simpler,
>> generic encoding of the metrics might make it easier to adapt and maintain
>> the system.  The generic format can include information about downstream
>> consumers, if that's useful.
>>
>> Spark supports a number of Metric Sinks - external monitoring systems.
>> If runners receive a simple list of metrics, implementing any number of
>> Sinks for Beam would be straightforward and would generally be a one time
>> implementation.  If instead all system metrics are sent embedded in a
>> highly structured, semantically meaningful structure, runner code would
>> need to be updated to support exporting the new metric. We seem to be
>> heading in the direction of "if you don't understand this metric, you can't
>> use it / export it".  But most systems seem to assume metrics are really
>> simple named values that can be handled a priori.
>>
>> So I guess my primary question is:  Is it necessary for Beam to treat
>> metrics as highly semantic, arbitrarily complex data?  Or could they
>> possibly be the sort of simple named values as they are in most monitoring
>> systems and in Spark?  With the SDK potentially providing scaffolding to
>> add meaning and structure, but simplifying that out before leaving SDK
>> code.  Is the coupling to a semantically meaningful structure between the
>> SDK and runner necessary complexity?
>>
>> Andrea
>>
>>
>>
>> On Fri, Apr 13, 2018 at 10:20 AM Robert Bradshaw 
>> wrote:
>>
>>> On Fri, Apr 13, 2018 at 10:10 AM Alex Amato  wrote:
>>>

 *Thank you for this clarification. I think the table of files fits into
 the model as one of type string-set (with union as aggregation). *
 It's not a list of files; it's a list of metadata for each file, several
 pieces of data per file.

 Are you proposing that there would be separate URNs as well for each
 entity being measured then, so that the URN defines the type of entity being
 measured.
 "urn.beam.metrics.PCollectionByteCount" is a URN always for
 PCollection entities
 "urn.beam.metrics.PTransformExecutionTime" is a URN always for
 PTransform entities

>>>
>>> Yes. FWIW, it may not even be needed to put this in the name, e.g.
>>> execution times are 

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-13 Thread Robert Bradshaw
+1 to keeping things simple, both in code and the model to understand.

I like thinking of things as (name, value, type) triples. Historically,
we've packed the entity name (e.g. PTransform name) into the string name
field and parsed it out in various places; I think it's worth pulling this
out and making it explicit instead, so metrics would be (name, entity,
value, type) tuples. In the current proposal:

Name is the URN + a possible bytes payload. (Actually, it's a bit unclear
if there's any relationship between counters with the same name and
different payloads. Also, the only use for payloads is because "User
Counter" is currently a single URN, rather than using the namespacing
characteristics of URNs to map user names onto distinct metric names. A
payload avoids the messiness of having to pack (and parse) arbitrary
parameters into a name though.) If we're going to choose names that the
system and sdks agree to have specific meanings, and to avoid accidental
collisions, making them full-fledged documented URNs has value.

Value is the "payload". Likely worth changing the name to avoid confusion
with the payload above. It's bytes because it depends on the type. I would
try to avoid nesting it too deeply (e.g. a payload within a payload). If we
think the types are generally limited, another option would be a oneof
field (with a bytes option just in case) for transparency. There are pros
and cons going this route.

Type is what I proposed we add, instead of it being implicit in the name
(and unknowable if one does not recognize the name). This makes things more
open-ended and easier to evolve and work with.

Entity could be generalized to Label, or LabelSet if desired. But as
mentioned I think it makes sense to pull this out as a separate field,
especially when it makes sense to aggregate a single named counter across
labels as well as for a single label (e.g. execution time of composite
transforms).
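The (name, entity, value, type) shape proposed above, and the reason a separate entity field helps, can be sketched as below. The metric names, entity ids, and type URN strings are invented for illustration; the point is that one named counter can be viewed per entity or rolled up across entities (e.g. summing execution time over the subtransforms of a composite).

```python
from collections import defaultdict

# A fixed, known set of aggregation types, keyed by an illustrative type name.
AGGREGATORS = {"sum-int64": sum, "max-int64": max}

# Metric updates as (name, entity, value, type) tuples.
updates = [
    ("execution_time_ms", "ptransform/Composite/ParDo1", 40, "sum-int64"),
    ("execution_time_ms", "ptransform/Composite/ParDo1", 5, "sum-int64"),
    ("execution_time_ms", "ptransform/Composite/ParDo2", 15, "sum-int64"),
]

def aggregate(updates, key):
    """Group updates by an arbitrary key and combine with the type's aggregator."""
    groups, types = defaultdict(list), {}
    for name, entity, value, typ in updates:
        groups[key(name, entity)].append(value)
        types[key(name, entity)] = typ
    return {k: AGGREGATORS[types[k]](v) for k, v in groups.items()}

# Per-entity view: one value per (name, entity) pair.
per_entity = aggregate(updates, key=lambda n, e: (n, e))
# Rolled up across the composite's children: one value per name.
rolled_up = aggregate(updates, key=lambda n, e: n)
assert rolled_up["execution_time_ms"] == 60
```

Because the entity is its own field, the roll-up is just a different grouping key; nothing has to be parsed back out of a packed string name.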

- Robert



On Fri, Apr 13, 2018 at 12:36 PM Andrea Foegler  wrote:

> Hi folks -
>
> Before we totally go down the path of highly structured metric protos, I'd
> like to propose considering a simple metrics interface between the SDK and
> the runner.  Something more generic and closer to what most monitoring
> systems would use.
>
> To use Spark as an example, the Metric system uses a simple metric format
> of name, value and type to report all metrics in a single structure,
> regardless of the source or context of the metric.
>
> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-MetricsSystem.html
>
> The subsystems have contracts for what metrics they will expose and how
> they are calculated:
>
> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-taskmetrics-ShuffleWriteMetrics.html
>
> Codifying the system metrics in the SDK seems perfectly reasonable - no
> reason to make the notion of metric generic at that level.  But at the
> point the metric is leaving the SDK and going to the runner, a simpler,
> generic encoding of the metrics might make it easier to adapt and maintain
> the system.  The generic format can include information about downstream
> consumers, if that's useful.
>
> Spark supports a number of Metric Sinks - external monitoring systems.  If
> runners receive a simple list of metrics, implementing any number of Sinks
> for Beam would be straightforward and would generally be a one time
> implementation.  If instead all system metrics are sent embedded in a
> highly structured, semantically meaningful structure, runner code would
> need to be updated to support exporting the new metric. We seem to be
> heading in the direction of "if you don't understand this metric, you can't
> use it / export it".  But most systems seem to assume metrics are really
> simple named values that can be handled a priori.
>
> So I guess my primary question is:  Is it necessary for Beam to treat
> metrics as highly semantic, arbitrarily complex data?  Or could they
> possibly be the sort of simple named values as they are in most monitoring
> systems and in Spark?  With the SDK potentially providing scaffolding to
> add meaning and structure, but simplifying that out before leaving SDK
> code.  Is the coupling to a semantically meaningful structure between the
> SDK and runner necessary complexity?
>
> Andrea
>
>
>
> On Fri, Apr 13, 2018 at 10:20 AM Robert Bradshaw 
> wrote:
>
>> On Fri, Apr 13, 2018 at 10:10 AM Alex Amato  wrote:
>>
>>>
>>> *Thank you for this clarification. I think the table of files fits into
>>> the model as one of type string-set (with union as aggregation). *
>>> It's not a list of files; it's a list of metadata for each file, several
>>> pieces of data per file.
>>>
>>> Are you proposing that there would be separate URNs as well for each
>>> entity being measured then, so that the URN defines the type of entity being
>>> measured.
>>> "urn.beam.metrics.PCollectionByteCount" is a URN for 

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-13 Thread Robert Bradshaw
On Fri, Apr 13, 2018 at 10:10 AM Alex Amato  wrote:

>
> *Thank you for this clarification. I think the table of files fits into
> the model as one of type string-set (with union as aggregation). *
> It's not a list of files; it's a list of metadata for each file, several
> pieces of data per file.
>
> Are you proposing that there would be separate URNs as well for each
> entity being measured then, so that the URN defines the type of entity being
> measured.
> "urn.beam.metrics.PCollectionByteCount" is a URN always for
> PCollection entities
> "urn.beam.metrics.PTransformExecutionTime" is a URN always for
> PTransform entities
>

Yes. FWIW, it may not even be needed to put this in the name, e.g.
execution times are never for PCollections, and even if they were it'd be
semantically a very different beast (which should not re-use the same URN).

*message MetricSpec {*
> *  // (Required) A URN that describes the accompanying payload.*
> *  // For any URN that is not recognized (by whomever is inspecting*
> *  // it) the parameter payload should be treated as opaque and*
> *  // passed as-is.*
> *  string urn = 1;*
>
> *  // (Optional) The data specifying any parameters to the URN. If*
> *  // the URN does not require any arguments, this may be omitted.*
> *  bytes parameters_payload = 2;*
>
> *  // (Required) A URN that describes the type of values this metric*
> *  // records (e.g. durations that should be summed).*
> *}*
>
> *message Metric[Values] {*
> * // (Required) The original requesting MetricSpec.*
> * MetricSpec metric_spec = 1;*
>
> * // A mapping of entities to (encoded) values.*
> * map<string, bytes> values;*
> This ignores the non-uniqueness of entity identifiers. This is why, in my
> doc, I have specified the entity type and its string identifier.
> @Ken, I believe you have pointed this out in the past, that uniqueness is
> only guaranteed within a type of entity (all PCollections), but not between
> entities (a PCollection and a PTransform may have the same identifier).
>

See above for why this is not an issue. The extra complexity (in protos and
code), the inability to use them as map keys, and the fact that they'll be
100% redundant for all entities for a given metric convinces me that it's
not worth creating and tracking an enum for the type alongside the id.
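The point about map keys can be sketched concretely: proto3 map keys must be scalar (integral or string) types, so a (type, id) message cannot key the values map anyway, while a plain string id can carry any needed disambiguation itself. The "kind/" prefix convention below is illustrative, not a settled format.

```python
# If entity identifiers are only unique within a kind (a PCollection and a
# PTransform may share an id), the id string itself can encode the kind,
# keeping the proto a simple map<string, bytes>.

def entity_key(kind: str, ident: str) -> str:
    """Build a globally unique entity key from a kind and a per-kind id."""
    return f"{kind}/{ident}"

values = {
    entity_key("pcollection", "out1"): b"\x2a",  # encoded metric value
    entity_key("ptransform", "out1"): b"\x07",   # same id, different kind
}

assert len(values) == 2  # no collision despite the shared identifier "out1"
```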


> *}*
>
> On Fri, Apr 13, 2018 at 9:14 AM Robert Bradshaw 
> wrote:
>
>> On Fri, Apr 13, 2018 at 8:31 AM Kenneth Knowles  wrote:
>>
>>>
>>> To Robert's proto:
>>>
>>>  // A mapping of entities to (encoded) values.
  map<string, bytes> values;

>>>
>>> Are the keys here the names of the metrics, aka what is used for URNs in
>>> the doc?
>>>

>> They're the entities to which a metric is attached, e.g. a PTransform, a
>> PCollection, or perhaps a process/worker.
>>
>>
>>> }

 On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles  wrote:
>
>> Agree with all of this. It echoes a thread on the doc that I was
>> going to bring here. Let's keep it simple and use concrete use cases to
>> drive additional abstraction if/when it becomes compelling.
>>
>> Kenn
>>
>> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers 
>> wrote:
>>
>>> Sounds perfect. Just wanted to make sure that "custom metrics of
>>> supported type" didn't include new ways of aggregating ints. As long as
>>> that means we have a fixed set of aggregations (that align with what
>>> users want and metrics back end support) it seems like we are doing user
>>> metrics right.
>>>
>>> - Ben
>>>
>>> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <
>>> rmannibu...@gmail.com> wrote:
>>>
 Maybe leave it out until proven it is needed. ATM counters are used
 a lot but others are less mainstream so being too fine from the start 
 can
 just add complexity and bugs in impls IMHO.

 Le 12 avr. 2018 08:06, "Robert Bradshaw"  a
 écrit :

> By "type" of metric, I mean both the data types (including their
> encoding) and accumulator strategy. So sumint would be a type, as 
> would
> double-distribution.
>
> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <
> bjchamb...@gmail.com> wrote:
>
>> When you say type do you mean accumulator type, result type, or
>> accumulator strategy? Specifically, what is the "type" of sumint, 
>> sumlong,
>> meanlong, etc?
>>
>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <
>> rober...@google.com> wrote:
>>
>>> Fully custom metric types is the "more speculative and
>>> difficult" feature that I was proposing we kick down the road (and 
>>> may
>>> never get to). What I'm suggesting is that we support custom 
>>> 

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-13 Thread Robert Bradshaw
On Fri, Apr 13, 2018 at 8:31 AM Kenneth Knowles  wrote:

>
> To Robert's proto:
>
>  // A mapping of entities to (encoded) values.
>>  map<string, bytes> values;
>>
>
> Are the keys here the names of the metrics, aka what is used for URNs in
> the doc?
>
>>
They're the entities to which a metric is attached, e.g. a PTransform, a
PCollection, or perhaps a process/worker.


> }
>>
>> On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles  wrote:
>>>
 Agree with all of this. It echoes a thread on the doc that I was going
 to bring here. Let's keep it simple and use concrete use cases to drive
 additional abstraction if/when it becomes compelling.

 Kenn

 On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers 
 wrote:

> Sounds perfect. Just wanted to make sure that "custom metrics of
> supported type" didn't include new ways of aggregating ints. As long as
> that means we have a fixed set of aggregations (that align with what
> users want and metrics back end support) it seems like we are doing user
> metrics right.
>
> - Ben
>
> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <
> rmannibu...@gmail.com> wrote:
>
>> Maybe leave it out until proven it is needed. ATM counters are used a
>> lot but others are less mainstream so being too fine from the start can
>> just add complexity and bugs in impls IMHO.
>>
>> Le 12 avr. 2018 08:06, "Robert Bradshaw"  a
>> écrit :
>>
>>> By "type" of metric, I mean both the data types (including their
>>> encoding) and accumulator strategy. So sumint would be a type, as would
>>> double-distribution.
>>>
>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers 
>>> wrote:
>>>
 When you say type do you mean accumulator type, result type, or
 accumulator strategy? Specifically, what is the "type" of sumint, 
 sumlong,
 meanlong, etc?

 On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw 
 wrote:

> Fully custom metric types is the "more speculative and difficult"
> feature that I was proposing we kick down the road (and may never get 
> to).
> What I'm suggesting is that we support custom metrics of standard 
> type.
>
> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers 
> wrote:
>
>> The metric api is designed to prevent user defined metric types
>> based on the fact they just weren't used enough to justify support.
>>
>> Is there a reason we are bringing that complexity back? Shouldn't
>> we just need the ability for the standard set plus any special system
>> metrics?
>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <
>> rober...@google.com> wrote:
>>
>>> Thanks. I think this has simplified things.
>>>
>>> One thing that has occurred to me is that we're conflating the
>>> idea of custom metrics and custom metric types. I would propose
>>> the MetricSpec field be augmented with an additional field "type" 
>>> which is
>>> a urn specifying the type of metric it is (i.e. the contents of its
>>> payload, as well as the form of aggregation). Summing or maxing 
>>> over ints
>>> would be a typical example. Though we could pursue making this 
>>> opaque to
>>> the runner in the long run, that's a more speculative (and 
>>> difficult)
>>> feature to tackle. This would allow the runner to at least 
>>> aggregate and
>>> report/return to the SDK metrics that it did not itself understand 
>>> the
>>> semantic meaning of. (It would probably simplify much of the 
>>> specialization
>>> in the runner itself for metrics that it *did* understand as well.)
>>>
>>> In addition, rather than having UserMetricOfTypeX for every type
>>> X one would have a single URN for UserMetric and it spec would 
>>> designate
>>> the type and payload designate the (qualified) name.
>>>
>>> - Robert
>>>
>>>
>>>
>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato 
>>> wrote:
>>>
 Thank you everyone for your feedback so far.
 I have made a revision today which is to make all metrics refer
 to a primary entity, so I have restructured some of the protos a 
 little bit.

 The point of this change was to futureproof the possibility of
 allowing custom user metrics, with custom aggregation functions 
 for its
 metric updates.
 Now that each metric has an aggregation_entity 

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-12 Thread Robert Bradshaw
On Thu, Apr 12, 2018 at 8:17 PM Alex Amato  wrote:

> I agree that there is some confusion about concepts. Here are several
> concepts which have come up in discussions, as I see them (not official
> names).
>
> *Metric*
>
>- For the purposes of my document, I have been referring to a Metric
>as any sort of information the SDK can send to the Runner
>   - This does not mean only quantitative, aggregated values.
>   - This can include other useful '*monitoring information*', for
>   supporting debugging/monitoring scenarios such as
>  - A table of files which are not yet finished reading, causing a
>  streaming pipeline to be blocked
>   - It has been pointed out to me that when many people hear "metric",
>a very specific thing comes to mind, in particular quantitative,
>aggregated values. *That is NOT what my document is limited to. I
>consider both that type of metric, and more arbitrary 'monitoring
>information', like a table of files with statuses in the proposal.*
>- Perhaps there should be another word for this concept, yet I have
>not yet come up with a good one, "monitoring information", "monitoring
>item" perhaps.
>
>
> *Metric types/Metric classes*
>
>- A collection of information reported on
>ProcessBundleProgressResponse and ProcessBundleResponse from the SDK to the
>RunnerHarness.
>   - e.g. execution time of ParDo functions.
>- In my proposal they are defined by a URN and two structs which are
>serialized into a MetricSpec and Metric bytes payload field, for requesting
>and responding to the metrics.
>   - e.g. beam:metric:ptransform_execution_times:v1 defines the
>   information needed to describe how a ptransform
>- All metrics which are passed across the FN API have a *metric type*
>
>
> *User metrics*
>
>- A metric added by a pipeline writer, using an SDK API to create
>these.
>- In my proposal the various *UserMetric types are a Metric Type. *
>   - e.g. “urn:beam:metric:user_distribution_data:v1” and 
> “urn:beam:metric:user_counter_data:v1”
>   define two metric types for packaging these user metrics and
>   communicating them across the FN API.
>   - SDK writers would need to write code to package the user metrics
>   from SDK API calls into their associated metric types to send them 
> across
>   the FN API.
>
> *Custom metric types*
>
>- A metric type which is not included in a catalog of first class beam
>metrics. This can be thought of as metrics a custom engine+runner+sdk
>(system as a whole) collects which is not part of the beam model.
>   - e.g. a closed source runner can define its own URNs and metrics,
>   extending the beam model
>  - for example an I/O source specific to a closed source
>  engine+runner+sdk may export a table of files it is reading with 
> statuses
>  as a custom metric type
>
>
> *Custom User Metrics with Custom Metric Types *
>
>    - Not proposed to be supported by the doc
>- A user specified metric, written by a pipeline writer with a custom
>metric type, likely would be implemented using a general mechanism to
>attach the custom metric.
>- May have a custom user specified aggregation function as well.
>
>
> *Reporting metrics to external systems such as Dropwizard*
>
>- My doc does not specifically cover this, it assumes that a runner
>harness would be responsible for reporting metrics in formats specific to
>    those external systems, such as Dropwizard. It assumes that the
>URNs+Metric types provided will be specified enough so that it would be
>possible to make such a translation.
>- Each metric type would need to be handled in the RunnerHarness, to
>collect and report the metric to an external system
>    - Some concern has come up about this, and whether it should dictate the
>format of the metrics which the SDK sends to the RunnerHarness of the FN
>API, rather than using the more custom URN+payload approach.
>- Though there could be URNs specifically designed to do this, the
>   intention of the design in the doc is to not require SDKs to give string
>   "names" to metrics, just to fill in URN payloads, and the Runner Harness
>   will pick names for metrics if needed to send to external systems.
>
> Just wanted to clarify this a bit. I hope the example of the table of
> files being a more complex metric type describes the usage of custom metric
> types. I'll update the doc with this.
>

Thank you for this clarification. I think the table of files fits into the
model as one of type string-set (with union as aggregation).
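The string-set-with-union idea can be sketched as below: each worker reports the set of file records it knows about, and partial reports combine with set union, which is commutative, associative, and idempotent, so reports can arrive in any order and be safely retried. The file paths and the "path|STATUS" encoding are made up for illustration.

```python
# Union aggregation for a string-set metric (e.g. the "table of files").

def union_aggregate(partial_reports):
    """Combine partial string-set reports from workers with set union."""
    result = set()
    for report in partial_reports:
        result |= report
    return result

worker1 = {"gs://bucket/shard-0000.txt|UNREAD", "gs://bucket/shard-0001.txt|READING"}
worker2 = {"gs://bucket/shard-0001.txt|READING", "gs://bucket/shard-0002.txt|UNREAD"}

merged = union_aggregate([worker1, worker2])
assert len(merged) == 3  # duplicates collapse under union
```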


> @Robert, I am not sure if you are proposing anything that is not in the
> current form of the doc.
>

Yes, I am.

Currently, the URN of the metric spec specifies both (1) the semantic
meaning of this metric (i.e. what exactly is being instrumented, whether
that be 

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-12 Thread Alex Amato
I agree that there is some confusion about concepts. Here are several
concepts which have come up in discussions, as I see them (not official
names).

*Metric*

   - For the purposes of my document, I have been referring to a Metric as
   any sort of information the SDK can send to the Runner
  - This does not mean only quantitative, aggregated values.
  - This can include other useful '*monitoring information*', for
  supporting debugging/monitoring scenarios such as
 - A table of files which are not yet finished reading, causing a
 streaming pipeline to be blocked
   - It has been pointed out to me that when many people hear "metric", a
   very specific thing comes to mind, in particular quantitative,
   aggregated values. *That is NOT what my document is limited to. I
   consider both that type of metric, and more arbitrary 'monitoring
   information', like a table of files with statuses in the proposal.*
   - Perhaps there should be another word for this concept, yet I have not
   yet come up with a good one, "monitoring information", "monitoring item"
   perhaps.


*Metric types/Metric classes*

   - A collection of information reported on ProcessBundleProgressResponse
   and ProcessBundleResponse from the SDK to the RunnerHarness.
  - e.g. execution time of ParDo functions.
   - In my proposal they are defined by a URN and two structs which are
   serialized into a MetricSpec and Metric bytes payload field, for requesting
   and responding to the metrics.
  - e.g. beam:metric:ptransform_execution_times:v1 defines the
  information needed to describe how a ptransform
   - All metrics which are passed across the FN API have a *metric type*


*User metrics*

   - A metric added by a pipeline writer, using an SDK API to create these.
   - In my proposal the various *UserMetric types are a Metric Type. *
  - e.g. “urn:beam:metric:user_distribution_data:v1” and
“urn:beam:metric:user_counter_data:v1”
  define two metric types for packaging these user metrics and
  communicating them across the FN API.
  - SDK writers would need to write code to package the user metrics
  from SDK API calls into their associated metric types to send them across
  the FN API.
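The packaging step described above can be sketched as follows. The URN matches the one quoted in the bullet; the payload encoding here is illustrative JSON rather than whatever proto encoding the final design specifies.

```python
import json

USER_COUNTER_URN = "urn:beam:metric:user_counter_data:v1"

def package_user_counter(namespace: str, name: str, value: int):
    """Wrap a user counter cell in a (URN, payload) pair for the Fn API."""
    payload = json.dumps(
        {"namespace": namespace, "name": name, "value": value}
    ).encode("utf-8")
    return {"urn": USER_COUNTER_URN, "payload": payload}

metric = package_user_counter("my.pipeline", "bad_records", 17)
assert metric["urn"] == USER_COUNTER_URN
assert json.loads(metric["payload"])["value"] == 17
```

The SDK-side API the pipeline author sees (counter objects with namespaces and names) stays unchanged; only this packaging layer knows about URNs and payloads.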

*Custom metric types*

   - A metric type which is not included in a catalog of first class beam
   metrics. This can be thought of as metrics a custom engine+runner+sdk
   (system as a whole) collects which is not part of the beam model.
  - e.g. a closed source runner can define its own URNs and metrics,
  extending the beam model
 - for example an I/O source specific to a closed source
 engine+runner+sdk may export a table of files it is reading
with statuses
 as a custom metric type


*Custom User Metrics with Custom Metric Types *

   - Not proposed to be supported by the doc
   - A user specified metric, written by a pipeline writer with a custom
   metric type, likely would be implemented using a general mechanism to
   attach the custom metric.
   - May have a custom user specified aggregation function as well.


*Reporting metrics to external systems such as Dropwizard*

   - My doc does not specifically cover this, it assumes that a runner
   harness would be responsible for reporting metrics in formats specific to
   those external systems, such as Dropwizard. It assumes that the
   URNs+Metric types provided will be specified enough so that it would be
   possible to make such a translation.
   - Each metric type would need to be handled in the RunnerHarness, to
   collect and report the metric to an external system
   - Some concern has come up about this, and whether it should dictate the
   format of the metrics which the SDK sends to the RunnerHarness of the FN
   API, rather than using the more custom URN+payload approach.
   - Though there could be URNs specifically designed to do this, the
  intention of the design in the doc is to not require SDKs to give string
  "names" to metrics, just to fill in URN payloads, and the Runner Harness
  will pick names for metrics if needed to send to external systems.
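One way a runner harness could pick such names is to derive them from the URN plus labels, so SDKs never choose string names themselves. The dotted naming scheme below is invented for illustration, not part of the proposal:

```python
# Hedged sketch: derive a Dropwizard-style dotted metric name from a
# metric's type URN and its labels, inside the runner harness.
def external_name(type_urn, labels):
    """Build a dotted name from a URN like urn:beam:metric:<short>:v1."""
    short = type_urn.rsplit(":", 2)[-2]  # e.g. "user_counter_data"
    parts = [short] + [labels[k] for k in sorted(labels)]
    return ".".join(parts)

name = external_name("urn:beam:metric:user_counter_data:v1",
                     {"ptransform": "ParDo1", "name": "elements"})
```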

Just wanted to clarify this a bit. I hope the example of the table of files
as a more complex metric type illustrates the usage of custom metric
types. I'll update the doc with this.

@Robert, I am not sure if you are proposing anything that is not in the
current form of the doc.

On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles  wrote:

> Agree with all of this. It echoes a thread on the doc that I was going to
> bring here. Let's keep it simple and use concrete use cases to drive
> additional abstraction if/when it becomes compelling.
>
> Kenn
>
> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers  wrote:
>
>> Sounds perfect. Just wanted to make sure that "custom metrics of
>> supported type" didn't include new ways of aggregating ints. As long as
>> that means we have a fixed set of aggregations 

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-12 Thread Kenneth Knowles
Agree with all of this. It echoes a thread on the doc that I was going to
bring here. Let's keep it simple and use concrete use cases to drive
additional abstraction if/when it becomes compelling.

Kenn

On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers  wrote:

> Sounds perfect. Just wanted to make sure that "custom metrics of supported
> type" didn't include new ways of aggregating ints. As long as that means we
> have a fixed set of aggregations (that align with what users want and
> metrics back end support) it seems like we are doing user metrics right.
>
> - Ben
>
> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau 
> wrote:
>
>> Maybe leave it out until proven it is needed. ATM counters are used a lot
>> but others are less mainstream so being too fine from the start can just
>> add complexity and bugs in impls IMHO.
>>
>> Le 12 avr. 2018 08:06, "Robert Bradshaw"  a écrit :
>>
>>> By "type" of metric, I mean both the data types (including their
>>> encoding) and accumulator strategy. So sumint would be a type, as would
>>> double-distribution.
>>>
>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers 
>>> wrote:
>>>
 When you say type do you mean accumulator type, result type, or
 accumulator strategy? Specifically, what is the "type" of sumint, sumlong,
 meanlong, etc?

 On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw 
 wrote:

> Fully custom metric types is the "more speculative and difficult"
> feature that I was proposing we kick down the road (and may never get to).
> What I'm suggesting is that we support custom metrics of standard type.
>
> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers 
> wrote:
>
>> The metric api is designed to prevent user defined metric types based
>> on the fact they just weren't used enough to justify support.
>>
>> Is there a reason we are bringing that complexity back? Shouldn't we
>> just need the ability for the standard set plus any special system 
>> metrics?
>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw 
>> wrote:
>>
>>> Thanks. I think this has simplified things.
>>>
>>> One thing that has occurred to me is that we're conflating the idea
>>> of custom metrics and custom metric types. I would propose the 
>>> MetricSpec
>>> field be augmented with an additional field "type" which is a urn
>>> specifying the type of metric it is (i.e. the contents of its payload, 
>>> as
>>> well as the form of aggregation). Summing or maxing over ints would be a
>>> typical example. Though we could pursue making this opaque to the 
>>> runner in
>>> the long run, that's a more speculative (and difficult) feature to 
>>> tackle.
>>> This would allow the runner to at least aggregate and report/return to 
>>> the
>>> SDK metrics that it did not itself understand the semantic meaning of. 
>>> (It
>>> would probably simplify much of the specialization in the runner itself 
>>> for
>>> metrics that it *did* understand as well.)
>>>
>>> In addition, rather than having UserMetricOfTypeX for every type X
>>> one would have a single URN for UserMetric and its spec would designate 
>>> the
>>> type and payload designate the (qualified) name.
>>>
>>> - Robert
>>>
>>>
>>>
>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato 
>>> wrote:
>>>
 Thank you everyone for your feedback so far.
 I have made a revision today which is to make all metrics refer to
 a primary entity, so I have restructured some of the protos a little 
 bit.

 The point of this change was to futureproof the possibility of
 allowing custom user metrics, with custom aggregation functions for its
 metric updates.
 Now that each metric has an aggregation_entity associated with it
 (e.g. PCollection, PTransform), we can design an approach which 
 forwards
 the opaque bytes metric updates, without deserializing them. These are
 forwarded to user provided code which then would deserialize the metric
 update payloads and perform the custom aggregations.

 I think it has also simplified some of the URN metric protos, as
 they do not need to keep track of ptransform names inside themselves 
 now.
 The result is simpler structures, for the metrics as the entities are
 pulled outside of the metric.

 I have mentioned this in the doc now, and wanted to draw attention
 to this particular revision.



 On Tue, Apr 10, 2018 at 9:53 AM Alex Amato 
 wrote:

> I've gathered a lot of feedback so far and want 

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-12 Thread Ben Chambers
Sounds perfect. Just wanted to make sure that "custom metrics of supported
type" didn't include new ways of aggregating ints. As long as that means we
have a fixed set of aggregations (that align with what users want and
metrics back end support) it seems like we are doing user metrics right.

- Ben

On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau 
wrote:

> Maybe leave it out until proven it is needed. ATM counters are used a lot
> but others are less mainstream so being too fine from the start can just
> add complexity and bugs in impls IMHO.
>
> Le 12 avr. 2018 08:06, "Robert Bradshaw"  a écrit :
>
>> By "type" of metric, I mean both the data types (including their
>> encoding) and accumulator strategy. So sumint would be a type, as would
>> double-distribution.
>>
>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers 
>> wrote:
>>
>>> When you say type do you mean accumulator type, result type, or
>>> accumulator strategy? Specifically, what is the "type" of sumint, sumlong,
>>> meanlong, etc?
>>>
>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw 
>>> wrote:
>>>
 Fully custom metric types is the "more speculative and difficult"
 feature that I was proposing we kick down the road (and may never get to).
 What I'm suggesting is that we support custom metrics of standard type.

 On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers 
 wrote:

> The metric api is designed to prevent user defined metric types based
> on the fact they just weren't used enough to justify support.
>
> Is there a reason we are bringing that complexity back? Shouldn't we
> just need the ability for the standard set plus any special system 
> metrivs?
> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw 
> wrote:
>
>> Thanks. I think this has simplified things.
>>
>> One thing that has occurred to me is that we're conflating the idea
>> of custom metrics and custom metric types. I would propose the MetricSpec
>> field be augmented with an additional field "type" which is a urn
>> specifying the type of metric it is (i.e. the contents of its payload, as
>> well as the form of aggregation). Summing or maxing over ints would be a
>> typical example. Though we could pursue making this opaque to the runner 
>> in
>> the long run, that's a more speculative (and difficult) feature to 
>> tackle.
>> This would allow the runner to at least aggregate and report/return to 
>> the
>> SDK metrics that it did not itself understand the semantic meaning of. 
>> (It
>> would probably simplify much of the specialization in the runner itself 
>> for
>> metrics that it *did* understand as well.)
>>
>> In addition, rather than having UserMetricOfTypeX for every type X
>> one would have a single URN for UserMetric and its spec would designate 
>> the
>> type and payload designate the (qualified) name.
>>
>> - Robert
>>
>>
>>
>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato 
>> wrote:
>>
>>> Thank you everyone for your feedback so far.
>>> I have made a revision today which is to make all metrics refer to a
>>> primary entity, so I have restructured some of the protos a little bit.
>>>
>>> The point of this change was to futureproof the possibility of
>>> allowing custom user metrics, with custom aggregation functions for its
>>> metric updates.
>>> Now that each metric has an aggregation_entity associated with it
>>> (e.g. PCollection, PTransform), we can design an approach which forwards
>>> the opaque bytes metric updates, without deserializing them. These are
>>> forwarded to user provided code which then would deserialize the metric
>>> update payloads and perform the custom aggregations.
>>>
>>> I think it has also simplified some of the URN metric protos, as
>>> they do not need to keep track of ptransform names inside themselves 
>>> now.
>>> The result is simpler structures, for the metrics as the entities are
>>> pulled outside of the metric.
>>>
>>> I have mentioned this in the doc now, and wanted to draw attention
>>> to this particular revision.
>>>
>>>
>>>
>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato 
>>> wrote:
>>>
 I've gathered a lot of feedback so far and want to make a decision
 by Friday, and begin working on related PRs next week.

 Please make sure that you provide your feedback before then and I
 will post the final decisions made to this thread Friday afternoon.


 On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía 
 wrote:

> Nice, I created a short link so people can refer to it easily 

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-12 Thread Romain Manni-Bucau
Maybe leave it out until proven it is needed. ATM counters are used a lot
but others are less mainstream so being too fine from the start can just
add complexity and bugs in impls IMHO.

Le 12 avr. 2018 08:06, "Robert Bradshaw"  a écrit :

> By "type" of metric, I mean both the data types (including their encoding)
> and accumulator strategy. So sumint would be a type, as would
> double-distribution.
>
> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers 
> wrote:
>
>> When you say type do you mean accumulator type, result type, or
>> accumulator strategy? Specifically, what is the "type" of sumint, sumlong,
>> meanlong, etc?
>>
>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw 
>> wrote:
>>
>>> Fully custom metric types is the "more speculative and difficult"
>>> feature that I was proposing we kick down the road (and may never get to).
>>> What I'm suggesting is that we support custom metrics of standard type.
>>>
>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers 
>>> wrote:
>>>
 The metric api is designed to prevent user defined metric types based
 on the fact they just weren't used enough to justify support.

 Is there a reason we are bringing that complexity back? Shouldn't we
 just need the ability for the standard set plus any special system metrics?
 On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw 
 wrote:

> Thanks. I think this has simplified things.
>
> One thing that has occurred to me is that we're conflating the idea of
> custom metrics and custom metric types. I would propose the MetricSpec
> field be augmented with an additional field "type" which is a urn
> specifying the type of metric it is (i.e. the contents of its payload, as
> well as the form of aggregation). Summing or maxing over ints would be a
> typical example. Though we could pursue making this opaque to the runner 
> in
> the long run, that's a more speculative (and difficult) feature to tackle.
> This would allow the runner to at least aggregate and report/return to the
> SDK metrics that it did not itself understand the semantic meaning of. (It
> would probably simplify much of the specialization in the runner itself 
> for
> metrics that it *did* understand as well.)
>
> In addition, rather than having UserMetricOfTypeX for every type X one
> would have a single URN for UserMetric and its spec would designate the 
> type
> and payload designate the (qualified) name.
>
> - Robert
>
>
>
> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato  wrote:
>
>> Thank you everyone for your feedback so far.
>> I have made a revision today which is to make all metrics refer to a
>> primary entity, so I have restructured some of the protos a little bit.
>>
>> The point of this change was to futureproof the possibility of
>> allowing custom user metrics, with custom aggregation functions for its
>> metric updates.
>> Now that each metric has an aggregation_entity associated with it
>> (e.g. PCollection, PTransform), we can design an approach which forwards
>> the opaque bytes metric updates, without deserializing them. These are
>> forwarded to user provided code which then would deserialize the metric
>> update payloads and perform the custom aggregations.
>>
>> I think it has also simplified some of the URN metric protos, as they
>> do not need to keep track of ptransform names inside themselves now. The
>> result is simpler structures, for the metrics as the entities are pulled
>> outside of the metric.
>>
>> I have mentioned this in the doc now, and wanted to draw attention to
>> this particular revision.
>>
>>
>>
>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato 
>> wrote:
>>
>>> I've gathered a lot of feedback so far and want to make a decision
>>> by Friday, and begin working on related PRs next week.
>>>
>>> Please make sure that you provide your feedback before then and I
>>> will post the final decisions made to this thread Friday afternoon.
>>>
>>>
>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía 
>>> wrote:
>>>
 Nice, I created a short link so people can refer to it easily in
 future discussions, website, etc.

 https://s.apache.org/beam-fn-api-metrics

 Thanks for sharing.


 On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
 rober...@google.com> wrote:
 > Thanks for the nice writeup. I added some comments.
 >
 > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato 
 wrote:
 >>
 >> Hello beam community,
 >>
 >> Thank you everyone for your initial feedback 

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-12 Thread Robert Bradshaw
By "type" of metric, I mean both the data types (including their encoding)
and accumulator strategy. So sumint would be a type, as would
double-distribution.
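A minimal sketch of that coupling of data type and accumulator strategy (the names `sumint` and `double-distribution` come from the thread; the tuple layout and registry are my own illustration):

```python
# Each metric "type" pairs an identity/zero accumulator with a combine
# function: sumint sums integers; double-distribution tracks
# (count, sum, min, max) over doubles.
def sum_int(acc, value):
    return acc + value

def double_distribution(acc, value):
    count, total, lo, hi = acc
    return (count + 1, total + value, min(lo, value), max(hi, value))

ACCUMULATORS = {
    "sumint": (0, sum_int),
    "double-distribution": ((0, 0.0, float("inf"), float("-inf")),
                            double_distribution),
}

def aggregate(type_name, values):
    """Fold values with the accumulator strategy registered for the type."""
    acc, combine = ACCUMULATORS[type_name]
    for v in values:
        acc = combine(acc, v)
    return acc
```

A runner that only understands the type URN, not the metric's semantic meaning, could still aggregate with this registry.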

On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers  wrote:

> When you say type do you mean accumulator type, result type, or
> accumulator strategy? Specifically, what is the "type" of sumint, sumlong,
> meanlong, etc?
>
> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw  wrote:
>
>> Fully custom metric types is the "more speculative and difficult" feature
>> that I was proposing we kick down the road (and may never get to). What I'm
>> suggesting is that we support custom metrics of standard type.
>>
>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers 
>> wrote:
>>
>>> The metric api is designed to prevent user defined metric types based on
>>> the fact they just weren't used enough to justify support.
>>>
>>> Is there a reason we are bringing that complexity back? Shouldn't we
>>> just need the ability for the standard set plus any special system metrics?
>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw 
>>> wrote:
>>>
 Thanks. I think this has simplified things.

 One thing that has occurred to me is that we're conflating the idea of
 custom metrics and custom metric types. I would propose the MetricSpec
 field be augmented with an additional field "type" which is a urn
 specifying the type of metric it is (i.e. the contents of its payload, as
 well as the form of aggregation). Summing or maxing over ints would be a
 typical example. Though we could pursue making this opaque to the runner in
 the long run, that's a more speculative (and difficult) feature to tackle.
 This would allow the runner to at least aggregate and report/return to the
 SDK metrics that it did not itself understand the semantic meaning of. (It
 would probably simplify much of the specialization in the runner itself for
 metrics that it *did* understand as well.)

 In addition, rather than having UserMetricOfTypeX for every type X one
 would have a single URN for UserMetric and its spec would designate the type
 and payload designate the (qualified) name.

 - Robert



 On Wed, Apr 11, 2018 at 5:12 PM Alex Amato  wrote:

> Thank you everyone for your feedback so far.
> I have made a revision today which is to make all metrics refer to a
> primary entity, so I have restructured some of the protos a little bit.
>
> The point of this change was to futureproof the possibility of
> allowing custom user metrics, with custom aggregation functions for its
> metric updates.
> Now that each metric has an aggregation_entity associated with it
> (e.g. PCollection, PTransform), we can design an approach which forwards
> the opaque bytes metric updates, without deserializing them. These are
> forwarded to user provided code which then would deserialize the metric
> update payloads and perform the custom aggregations.
>
> I think it has also simplified some of the URN metric protos, as they
> do not need to keep track of ptransform names inside themselves now. The
> result is simpler structures, for the metrics as the entities are pulled
> outside of the metric.
>
> I have mentioned this in the doc now, and wanted to draw attention to
> this particular revision.
>
>
>
> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato  wrote:
>
>> I've gathered a lot of feedback so far and want to make a decision by
>> Friday, and begin working on related PRs next week.
>>
>> Please make sure that you provide your feedback before then and I
>> will post the final decisions made to this thread Friday afternoon.
>>
>>
>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía 
>> wrote:
>>
>>> Nice, I created a short link so people can refer to it easily in
>>> future discussions, website, etc.
>>>
>>> https://s.apache.org/beam-fn-api-metrics
>>>
>>> Thanks for sharing.
>>>
>>>
>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>> rober...@google.com> wrote:
>>> > Thanks for the nice writeup. I added some comments.
>>> >
>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato 
>>> wrote:
>>> >>
>>> >> Hello beam community,
>>> >>
>>> >> Thank you everyone for your initial feedback on this proposal so
>>> far. I
>>> >> have made some revisions based on the feedback. There were some
>>> larger
>>> >> questions asking about alternatives. For each of these I have
>>> added a
>>> >> section tagged with [Alternatives] and discussed my
>>> recommendation as well
>>> >> as a few other choices we considered.
>>> >>
>>> >> I would appreciate more 

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-11 Thread Ben Chambers
When you say type do you mean accumulator type, result type, or accumulator
strategy? Specifically, what is the "type" of sumint, sumlong, meanlong,
etc?

On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw  wrote:

> Fully custom metric types is the "more speculative and difficult" feature
> that I was proposing we kick down the road (and may never get to). What I'm
> suggesting is that we support custom metrics of standard type.
>
> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers  wrote:
>
>> The metric api is designed to prevent user defined metric types based on
>> the fact they just weren't used enough to justify support.
>>
>> Is there a reason we are bringing that complexity back? Shouldn't we just
>> need the ability for the standard set plus any special system metrics?
>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw 
>> wrote:
>>
>>> Thanks. I think this has simplified things.
>>>
>>> One thing that has occurred to me is that we're conflating the idea of
>>> custom metrics and custom metric types. I would propose the MetricSpec
>>> field be augmented with an additional field "type" which is a urn
>>> specifying the type of metric it is (i.e. the contents of its payload, as
>>> well as the form of aggregation). Summing or maxing over ints would be a
>>> typical example. Though we could pursue making this opaque to the runner in
>>> the long run, that's a more speculative (and difficult) feature to tackle.
>>> This would allow the runner to at least aggregate and report/return to the
>>> SDK metrics that it did not itself understand the semantic meaning of. (It
>>> would probably simplify much of the specialization in the runner itself for
>>> metrics that it *did* understand as well.)
>>>
>>> In addition, rather than having UserMetricOfTypeX for every type X one
>>> would have a single URN for UserMetric and its spec would designate the type
>>> and payload designate the (qualified) name.
>>>
>>> - Robert
>>>
>>>
>>>
>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato  wrote:
>>>
 Thank you everyone for your feedback so far.
 I have made a revision today which is to make all metrics refer to a
 primary entity, so I have restructured some of the protos a little bit.

 The point of this change was to futureproof the possibility of allowing
 custom user metrics, with custom aggregation functions for its metric
 updates.
 Now that each metric has an aggregation_entity associated with it (e.g.
 PCollection, PTransform), we can design an approach which forwards the
 opaque bytes metric updates, without deserializing them. These are
 forwarded to user provided code which then would deserialize the metric
 update payloads and perform the custom aggregations.

 I think it has also simplified some of the URN metric protos, as they
 do not need to keep track of ptransform names inside themselves now. The
 result is simpler structures, for the metrics as the entities are pulled
 outside of the metric.

 I have mentioned this in the doc now, and wanted to draw attention to
 this particular revision.



 On Tue, Apr 10, 2018 at 9:53 AM Alex Amato  wrote:

> I've gathered a lot of feedback so far and want to make a decision by
> Friday, and begin working on related PRs next week.
>
> Please make sure that you provide your feedback before then and I will
> post the final decisions made to this thread Friday afternoon.
>
>
> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía  wrote:
>
>> Nice, I created a short link so people can refer to it easily in
>> future discussions, website, etc.
>>
>> https://s.apache.org/beam-fn-api-metrics
>>
>> Thanks for sharing.
>>
>>
>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw 
>> wrote:
>> > Thanks for the nice writeup. I added some comments.
>> >
>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato 
>> wrote:
>> >>
>> >> Hello beam community,
>> >>
>> >> Thank you everyone for your initial feedback on this proposal so
>> far. I
>> >> have made some revisions based on the feedback. There were some
>> larger
>> >> questions asking about alternatives. For each of these I have
>> added a
>> >> section tagged with [Alternatives] and discussed my recommendation
>> as well
>> >> as a few other choices we considered.
>> >>
>> >> I would appreciate more feedback on the revised proposal. Please
>> take
>> >> another look and let me know
>> >>
>> >>
>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>> >>
>> >> Etienne, I would appreciate it if you could please take another
>> look after
>> >> the revisions I have made as well.

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-11 Thread Robert Bradshaw
Fully custom metric types is the "more speculative and difficult" feature
that I was proposing we kick down the road (and may never get to). What I'm
suggesting is that we support custom metrics of standard type.

On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers  wrote:

> The metric api is designed to prevent user defined metric types based on
> the fact they just weren't used enough to justify support.
>
> Is there a reason we are bringing that complexity back? Shouldn't we just
> need the ability for the standard set plus any special system metrics?
> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw  wrote:
>
>> Thanks. I think this has simplified things.
>>
>> One thing that has occurred to me is that we're conflating the idea of
>> custom metrics and custom metric types. I would propose the MetricSpec
>> field be augmented with an additional field "type" which is a urn
>> specifying the type of metric it is (i.e. the contents of its payload, as
>> well as the form of aggregation). Summing or maxing over ints would be a
>> typical example. Though we could pursue making this opaque to the runner in
>> the long run, that's a more speculative (and difficult) feature to tackle.
>> This would allow the runner to at least aggregate and report/return to the
>> SDK metrics that it did not itself understand the semantic meaning of. (It
>> would probably simplify much of the specialization in the runner itself for
>> metrics that it *did* understand as well.)
>>
>> In addition, rather than having UserMetricOfTypeX for every type X one
>> would have a single URN for UserMetric and its spec would designate the type
>> and payload designate the (qualified) name.
>>
>> - Robert
>>
>>
>>
>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato  wrote:
>>
>>> Thank you everyone for your feedback so far.
>>> I have made a revision today which is to make all metrics refer to a
>>> primary entity, so I have restructured some of the protos a little bit.
>>>
>>> The point of this change was to futureproof the possibility of allowing
>>> custom user metrics, with custom aggregation functions for its metric
>>> updates.
>>> Now that each metric has an aggregation_entity associated with it (e.g.
>>> PCollection, PTransform), we can design an approach which forwards the
>>> opaque bytes metric updates, without deserializing them. These are
>>> forwarded to user provided code which then would deserialize the metric
>>> update payloads and perform the custom aggregations.
>>>
>>> I think it has also simplified some of the URN metric protos, as they do
>>> not need to keep track of ptransform names inside themselves now. The
>>> result is simpler structures, for the metrics as the entities are pulled
>>> outside of the metric.
>>>
>>> I have mentioned this in the doc now, and wanted to draw attention to
>>> this particular revision.
>>>
>>>
>>>
>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato  wrote:
>>>
 I've gathered a lot of feedback so far and want to make a decision by
 Friday, and begin working on related PRs next week.

 Please make sure that you provide your feedback before then and I will
 post the final decisions made to this thread Friday afternoon.


 On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía  wrote:

> Nice, I created a short link so people can refer to it easily in
> future discussions, website, etc.
>
> https://s.apache.org/beam-fn-api-metrics
>
> Thanks for sharing.
>
>
> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw 
> wrote:
> > Thanks for the nice writeup. I added some comments.
> >
> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato 
> wrote:
> >>
> >> Hello beam community,
> >>
> >> Thank you everyone for your initial feedback on this proposal so
> far. I
> >> have made some revisions based on the feedback. There were some
> larger
> >> questions asking about alternatives. For each of these I have added
> a
> >> section tagged with [Alternatives] and discussed my recommendation
> as well
> >> as a few other choices we considered.
> >>
> >> I would appreciate more feedback on the revised proposal. Please
> take
> >> another look and let me know
> >>
> >>
> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
> >>
> >> Etienne, I would appreciate it if you could please take another
> look after
> >> the revisions I have made as well.
> >>
> >> Thanks again,
> >> Alex
> >>
> >
>



Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-11 Thread Ben Chambers
The metric API was designed to exclude user defined metric types, based on
the fact that they just weren't used enough to justify support.

Is there a reason we are bringing that complexity back? Shouldn't we just
need the ability for the standard set plus any special system metrics?

On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw  wrote:

> Thanks. I think this has simplified things.
>
> One thing that has occurred to me is that we're conflating the idea of
> custom metrics and custom metric types. I would propose the MetricSpec
> field be augmented with an additional field "type" which is a urn
> specifying the type of metric it is (i.e. the contents of its payload, as
> well as the form of aggregation). Summing or maxing over ints would be a
> typical example. Though we could pursue making this opaque to the runner in
> the long run, that's a more speculative (and difficult) feature to tackle.
> This would allow the runner to at least aggregate and report/return to the
> SDK metrics that it did not itself understand the semantic meaning of. (It
> would probably simplify much of the specialization in the runner itself for
> metrics that it *did* understand as well.)
>
> In addition, rather than having UserMetricOfTypeX for every type X one
> would have a single URN for UserMetric and its spec would designate the type
> and payload designate the (qualified) name.
>
> - Robert
>
>
>
> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato  wrote:
>
>> Thank you everyone for your feedback so far.
>> I have made a revision today which is to make all metrics refer to a
>> primary entity, so I have restructured some of the protos a little bit.
>>
>> The point of this change was to futureproof the possibility of allowing
>> custom user metrics, with custom aggregation functions for its metric
>> updates.
>> Now that each metric has an aggregation_entity associated with it (e.g.
>> PCollection, PTransform), we can design an approach which forwards the
>> opaque bytes metric updates, without deserializing them. These are
>> forwarded to user provided code which then would deserialize the metric
>> update payloads and perform the custom aggregations.
>>
>> I think it has also simplified some of the URN metric protos, as they do
>> not need to keep track of ptransform names inside themselves now. The
>> result is simpler structures, for the metrics as the entities are pulled
>> outside of the metric.
>>
>> I have mentioned this in the doc now, and wanted to draw attention to
>> this particular revision.
>>
>>
>>
>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato  wrote:
>>
>>> I've gathered a lot of feedback so far and want to make a decision by
>>> Friday, and begin working on related PRs next week.
>>>
>>> Please make sure that you provide your feedback before then and I will
>>> post the final decisions made to this thread Friday afternoon.
>>>
>>>
>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía  wrote:
>>>
 Nice, I created a short link so people can refer to it easily in
 future discussions, website, etc.

 https://s.apache.org/beam-fn-api-metrics

 Thanks for sharing.


 On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw 
 wrote:
 > Thanks for the nice writeup. I added some comments.
 >
 > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato  wrote:
 >>
 >> Hello beam community,
 >>
 >> Thank you everyone for your initial feedback on this proposal so
 far. I
 >> have made some revisions based on the feedback. There were some
 larger
 >> questions asking about alternatives. For each of these I have added a
 >> section tagged with [Alternatives] and discussed my recommendation
 as well
>> as a few other choices we considered.
 >>
 >> I would appreciate more feedback on the revised proposal. Please take
 >> another look and let me know
 >>
 >>
 https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
 >>
 >> Etienne, I would appreciate it if you could please take another look
 after
 >> the revisions I have made as well.
 >>
 >> Thanks again,
 >> Alex
 >>
 >

>>>


Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-11 Thread Alex Amato
Thank you everyone for your feedback so far.
I have made a revision today: all metrics now refer to a primary entity,
which required restructuring some of the protos a little.

The point of this change was to future-proof the possibility of allowing
custom user metrics, with custom aggregation functions for their metric
updates.
Now that each metric has an aggregation_entity associated with it (e.g.
PCollection, PTransform), we can design an approach which forwards the
opaque bytes metric updates without deserializing them. These are
forwarded to user-provided code, which then deserializes the metric
update payloads and performs the custom aggregations.

I think it has also simplified some of the URN metric protos, as they no
longer need to keep track of ptransform names inside themselves. The
result is simpler structures for the metrics, as the entities are pulled
outside of the metric.
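The restructuring described above can be sketched roughly as follows. This is an illustrative Python sketch of the idea, not the actual proto definitions; all names here (MetricKey, MetricUpdate, the URN string, the entity labels) are hypothetical stand-ins for the shapes discussed in the proposal.

```python
import struct
from dataclasses import dataclass
from typing import Callable, Dict, List

# The aggregation entity (e.g. a PCollection or PTransform id) lives
# *outside* the metric body, so the runner can group and route opaque
# payloads without ever deserializing them.

@dataclass(frozen=True)
class MetricKey:
    urn: str     # identifies the metric type and its aggregation
    entity: str  # e.g. "ptransform:ParDo" or "pcollection:pc1"

@dataclass
class MetricUpdate:
    key: MetricKey
    payload: bytes  # opaque to the runner; only user code decodes it

# User-provided aggregators, registered per URN. They deserialize the
# raw payloads and perform the custom aggregation.
Aggregator = Callable[[List[bytes]], bytes]

def aggregate(updates: List[MetricUpdate],
              aggregators: Dict[str, Aggregator]) -> Dict[MetricKey, bytes]:
    # Group opaque payloads by (urn, entity) without decoding them,
    # then hand each group to the matching user aggregator.
    grouped: Dict[MetricKey, List[bytes]] = {}
    for u in updates:
        grouped.setdefault(u.key, []).append(u.payload)
    return {k: aggregators[k.urn](ps) for k, ps in grouped.items()}

# Example: a user-defined sum metric whose payload is a little-endian int64.
def sum_int64(payloads: List[bytes]) -> bytes:
    total = sum(struct.unpack("<q", p)[0] for p in payloads)
    return struct.pack("<q", total)

updates = [
    MetricUpdate(MetricKey("user:sum_int64", "ptransform:ParDo"), struct.pack("<q", 3)),
    MetricUpdate(MetricKey("user:sum_int64", "ptransform:ParDo"), struct.pack("<q", 4)),
]
result = aggregate(updates, {"user:sum_int64": sum_int64})
print(struct.unpack("<q", result[MetricKey("user:sum_int64", "ptransform:ParDo")])[0])
# prints: 7
```

Note how the runner-side aggregate() never interprets the bytes; keeping the entity in the key is what makes that grouping possible.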

I have mentioned this in the doc now, and wanted to draw attention to this
particular revision.





Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-10 Thread Alex Amato
I've gathered a lot of feedback so far and want to make a decision by
Friday, and begin working on related PRs next week.

Please make sure that you provide your feedback before then and I will post
the final decisions made to this thread Friday afternoon.



Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-05 Thread Ismaël Mejía
Nice, I created a short link so people can refer to it easily in
future discussions, website, etc.

https://s.apache.org/beam-fn-api-metrics

Thanks for sharing.




Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

2018-04-04 Thread Robert Bradshaw
Thanks for the nice writeup. I added some comments.

On Wed, Apr 4, 2018 at 1:53 PM Alex Amato  wrote:

> Hello Beam community,
>
> Thank you everyone for your initial feedback on this proposal so far. I
> have made some revisions based on the feedback. There were some larger
> questions asking about alternatives. For each of these I have added a
> section tagged with [Alternatives] and discussed my recommendation as well
> as a few other choices we considered.
>
> I would appreciate more feedback on the revised proposal. Please take
> another look and let me know what you think.
>
> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>
> Etienne, I would appreciate it if you could please take another look after
> the revisions I have made as well.
>
> Thanks again,
> Alex
>
>