+1 for the new metrics design, I especially like the Dropwizard-like API which will probably be more familiar to users -Aggregators can be misleading for example to Spark users, Spark Aggregators are Beam Accumulators and Spark Accumulators are Beam Aggregators...
Just take into account that runners could take a while to follow through so, I'll point the obvious just so it's noted here: Once the new metrics API is in, we can annotate Aggregators as @Deprecated, and allow a fitting grace-period for runner developers to support the new Metrics API. Thanks, Amit On Wed, Oct 12, 2016 at 9:22 PM Robert Bradshaw <[email protected]> wrote: > +1 to the new metrics design. I strongly favor B as well. > > On Wed, Oct 12, 2016 at 10:54 AM, Kenneth Knowles > <[email protected]> wrote: > > Correction: In my eagerness to see the end of aggregators, I mistook the > > intention. Both A and B leave aggregators in place until there is a > > replacement. In which case, I am strongly in favor of B. As soon as we > can > > remove aggregators, I think we should. > > > > On Wed, Oct 12, 2016 at 10:48 AM Kenneth Knowles <[email protected]> wrote: > > > >> Huzzah! This is IMO a really great change. I agree that we can get > >> something in to allow work to continue, and improve the API as we learn. > >> > >> On Wed, Oct 12, 2016 at 10:20 AM Ben Chambers > <[email protected]> > >> wrote: > >> > >> 3. One open question is what to do with Aggregators. In the doc I > mentioned > >> > >> that long term I'd like to consider whether we can improve Aggregators > to > >> be a better fit for the model by supporting windowing and allowing them > to > >> serve as input for future steps. In the interim it's not clear what we > >> should do with them. The two obvious (and extreme) options seem to be: > >> > >> > >> > >> Option A: Do nothing, leave aggregators as they are until we revisit. > >> > >> > >> Option B: Remove aggregators from the SDK until we revisit. > >> > >> I'd like to suggest removing Aggregators once the existing runners have > >> reasonable support for Metrics. Doing so reduces the surface area we > need > >> to maintain/support and simplifies other changes being made. It will > also > >> allow us to revisit them from a clean slate. > >> > >> > >> +1 to removing aggregators, either of A or B. The new metrics design > >> addresses aggregator use cases as well or better. > >> > >> So A vs B is a choice of whether we have a gap with no aggregator or > >> metrics-like functionality. I think that is perhaps a bit of a bummer > for > >> users, and we will likely port over the runner code for it, so we > wouldn't > >> want to actually delete it, right? Can we do it in a week or two? > >> > >> One thing motivating me to do this quickly: Currently the new DoFn does > >> not have its own implementation of aggregators, but leverages that of > >> OldDoFn, so we cannot remove OldDoFn until either (1) new DoFn > >> re-implements the aggregator instantiation and worker-side delegation > (not > >> hard, but it is throwaway code) or (2) aggregators are removed. This > >> dependency also makes running the new DoFn directly (required for the > state > >> API) a bit more annoying. > >> >
