Correction: In my eagerness to see the end of aggregators, I mistook the intention. Both A and B leave aggregators in place until there is a replacement. In which case, I am strongly in favor of B. As soon as we can remove aggregators, I think we should.
On Wed, Oct 12, 2016 at 10:48 AM Kenneth Knowles <[email protected]> wrote: > Huzzah! This is IMO a really great change. I agree that we can get > something in to allow work to continue, and improve the API as we learn. > > On Wed, Oct 12, 2016 at 10:20 AM Ben Chambers <[email protected]> > wrote: > > 3. One open question is what to do with Aggregators. In the doc I mentioned > > that long term I'd like to consider whether we can improve Aggregators to > be a better fit for the model by supporting windowing and allowing them to > serve as input for future steps. In the interim it's not clear what we > should do with them. The two obvious (and extreme) options seem to be: > > > > Option A: Do nothing, leave aggregators as they are until we revisit. > > > Option B: Remove aggregators from the SDK until we revisit. > > I'd like to suggest removing Aggregators once the existing runners have > reasonable support for Metrics. Doing so reduces the surface area we need > to maintain/support and simplifies other changes being made. It will also > allow us to revisit them from a clean slate. > > > +1 to removing aggregators, either of A or B. The new metrics design > addresses aggregator use cases as well or better. > > So A vs B is a choice of whether we have a gap with no aggregator or > metrics-like functionality. I think that is perhaps a bit of a bummer for > users, and we will likely port over the runner code for it, so we wouldn't > want to actually delete it, right? Can we do it in a week or two? > > One thing motivating me to do this quickly: Currently the new DoFn does > not have its own implementation of aggregators, but leverages that of > OldDoFn, so we cannot remove OldDoFn until either (1) new DoFn > re-implements the aggregator instantiation and worker-side delegation (not > hard, but it is throwaway code) or (2) aggregators are removed. This > dependency also makes running the new DoFn directly (required for the state > API) a bit more annoying. >
