Huzzah! This is IMO a really great change. I agree that we can get something in to allow work to continue, and improve the API as we learn.
On Wed, Oct 12, 2016 at 10:20 AM Ben Chambers <[email protected]> wrote: > 3. One open question is what to do with Aggregators. In the doc I mentioned that long term I'd like to consider whether we can improve Aggregators to > be a better fit for the model by supporting windowing and allowing them to > serve as input for future steps. In the interim it's not clear what we > should do with them. The two obvious (and extreme) options seem to be: > Option A: Do nothing, leave aggregators as they are until we revisit. > > Option B: Remove aggregators from the SDK until we revisit. > > I'd like to suggest removing Aggregators once the existing runners have > reasonable support for Metrics. Doing so reduces the surface area we need > to maintain/support and simplifies other changes being made. It will also > allow us to revisit them from a clean slate. > +1 to removing aggregators, either of A or B. The new metrics design addresses aggregator use cases as well or better. So A vs B is a choice of whether we have a gap with no aggregator or metrics-like functionality. I think that is perhaps a bit of a bummer for users, and we will likely port over the runner code for it, so we wouldn't want to actually delete it, right? Can we do it in a week or two? One thing motivating me to do this quickly: Currently the new DoFn does not have its own implementation of aggregators, but leverages that of OldDoFn, so we cannot remove OldDoFn until either (1) new DoFn re-implements the aggregator instantiation and worker-side delegation (not hard, but it is throwaway code) or (2) aggregators are removed. This dependency also makes running the new DoFn directly (required for the state API) a bit more annoying.
