I am not opposed to removing other data types, though they are extra convenience for user.
In Scott's example above, if the metric is a counter, what are the guarantees provided? E.g. would it match the global count using GBK? If yes, then gauges (especially per-key gauges) can be very useful too (e.g. backlog for each Kafka partition/split). On Fri, Apr 6, 2018 at 10:01 AM Robert Bradshaw <rober...@google.com> wrote: > A String API makes it clear(er) that the values will not be aggregated in > any way across workers. I don't think retaining both APIs (except for > possibly some short migration period) worthwhile. On another note, I still > find the distributed gague API to be a bit odd in general. > > On Fri, Apr 6, 2018 at 9:46 AM Raghu Angadi <rang...@google.com> wrote: > >> I would be in favor of replacing the existing Gauge.set(long) API with >>> the String version and removing the old one. This would be a breaking >>> change. However this is a relatively new API and is still marked >>> @Experimental. Keeping the old API would retain the potential confusion. >>> It's better to simplify the API surface: having two APIs makes it less >>> clear which one users should choose. >> >> >> Supporting additional data types sounds good. But the above states string >> API will replace the existing API. I do not see how string API makes the >> semantics more clear. Semantically both are same to the user. >> >> On Fri, Apr 6, 2018 at 9:31 AM Pablo Estrada <pabl...@google.com> wrote: >> >>> Hi Ben : D >>> >>> Sure, that's reasonable. And perhaps I started the discussion in the >>> wrong direction. I'm not questioning the utility of Gauge metrics. >>> >>> What I'm saying is that Beam only supports integers,, but Gauges are >>> aggregated by dropping old values depending on their update times; so it >>> might be desirable to not restrict the data type to just integers. >>> >>> -P. >>> >>> On Fri, Apr 6, 2018 at 9:19 AM Ben Chambers <bchamb...@apache.org> >>> wrote: >>> >>>> See for instance how gauge metrics are handled in Prometheus, Datadog >>>> and Stackdriver monitoring. Gauges are perfect for use in distributed >>>> systems, they just need to be properly labeled. Perhaps we should apply a >>>> default tag or allow users to specify one. >>>> >>>> On Fri, Apr 6, 2018, 9:14 AM Ben Chambers <bchamb...@apache.org> wrote: >>>> >>>>> Some metrics backend label the value, for instance with the worker >>>>> that sent it. Then the aggregation is latest per label. This makes it >>>>> useful for holding values such as "memory usage" that need to hold current >>>>> value. >>>>> >>>>> On Fri, Apr 6, 2018, 9:00 AM Scott Wegner <sweg...@google.com> wrote: >>>>> >>>>>> +1 on the proposal to support a "String" gauge. >>>>>> >>>>>> To expand a bit, the current API doesn't make it clear that the gauge >>>>>> value is based on local state. If a runner chooses to parallelize a DoFn >>>>>> across many workers, each worker will have its own local Gauge metric and >>>>>> its updates will overwrite other values. For example, from the API it >>>>>> looks >>>>>> like you could use a gauge to implement your own element count metric: >>>>>> >>>>>> long count = 0; >>>>>> @ProcessElement >>>>>> public void processElement(ProcessContext c) { >>>>>> myGauge.set(++count); >>>>>> c.output(c.element()); >>>>>> } >>>>>> >>>>>> This looks correct, but each worker has their own local 'count' >>>>>> field, and gauge metric updates from parallel workers will overwrite each >>>>>> other rather than get aggregated. So the final value would be "the number >>>>>> of elements processed on one of the workers". (The correct implementation >>>>>> uses a Counter metric). >>>>>> >>>>>> I would be in favor of replacing the existing Gauge.set(long) API >>>>>> with the String version and removing the old one. This would be a >>>>>> breaking >>>>>> change. However this is a relatively new API and is still marked >>>>>> @Experimental. Keeping the old API would retain the potential confusion. >>>>>> It's better to simplify the API surface: having two APIs makes it less >>>>>> clear which one users should choose. >>>>>> >>>>>> On Fri, Apr 6, 2018 at 8:28 AM Pablo Estrada <pabl...@google.com> >>>>>> wrote: >>>>>> >>>>>>> Hello all, >>>>>>> As I was working on adding support for Gauges in Dataflow, some >>>>>>> noted that Gauge is a fairly unusual kind of metric for a distributed >>>>>>> environment, since many workers will report different values and stomp >>>>>>> on >>>>>>> each other's all the time. >>>>>>> >>>>>>> We also looked at Flink and Dropwizard Gauge metrics [1][2], and we >>>>>>> found that these use generics, and Flink explicitly mentions that a >>>>>>> toString implementation is required[3]. >>>>>>> >>>>>>> With that in mind, I'm thinking that it might make sense to 1) >>>>>>> expand Gauge to support string values (keep int-based API for backwards >>>>>>> compatibility), and migrate it to use string behind the covers. >>>>>>> >>>>>>> What does everyone think about this? >>>>>>> >>>>>>> Best >>>>>>> -P. >>>>>>> >>>>>>> 1 - >>>>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html#metric-types >>>>>>> 2 - https://metrics.dropwizard.io/3.1.0/manual/core/#gauges >>>>>>> 3 - >>>>>>> https://github.com/apache/flink/blob/master/docs/monitoring/metrics.md#gauge >>>>>>> JIRA issue for Gauge metrics - >>>>>>> https://issues.apache.org/jira/browse/BEAM-1616 >>>>>>> -- >>>>>>> Got feedback? go/pabloem-feedback >>>>>>> <https://goto.google.com/pabloem-feedback> >>>>>>> >>>>>> -- >>>>>> >>>>>> >>>>>> Got feedback? http://go/swegner-feedback >>>>>> >>>>> -- >>> Got feedback? go/pabloem-feedback >>> <https://goto.google.com/pabloem-feedback> >>> >>