Thanks Aljoscha and Max,
The fact that this only happens in failure scenarios is good to know, thanks.
Perhaps I'm unclear on what an “Aggregator” is. I assumed that a line such as
the following:
PCollection<KV<String, Double>> meanByName =
dataPoints.apply(Mean.<String, Double>perKey());
…would be considered an Aggregator, since it applies a mean aggregation over a
window. Is that correct, with respect to the Beam terminology? If not, what
would an example of an Aggregator be?
Thanks,
Bill
> On Mar 22, 2016, at 2:18 PM, Mark Shields <[email protected]> wrote:
>
> Google's streaming implementation has the same property: counters are not
> committed with work and so updates may sometimes be lost (ie undercounted),
> or may be replayed (ie overcounted). It's a tradeoff between having
> low-latency and cheep monitoring against coherence with the underlying
> processing.
>
> On Tue, Mar 22, 2016 at 1:57 AM, Aljoscha Krettek <[email protected]
> <mailto:[email protected]>> wrote:
> Hi,
> in Flink the accumulators/aggregators are not faul-tolerant. In case of a
> failure the job will be restarted but the accumulators will start from
> scratch. Initially they were only meant as a rough way to gauge the progress
> that a job is making. People should not rely on them for accurate numbers
> right now.
>
> Cheers,
> Aljoscha
> > On 21 Mar 2016, at 20:37, William McCarthy <[email protected]
> > <mailto:[email protected]>> wrote:
> >
> > Hi,
> >
> > I just had a look at the capability matrix here:
> > http://beam.incubator.apache.org/capability-matrix/
> > <http://beam.incubator.apache.org/capability-matrix/> . I really like it,
> > as it gives a nice summary of the current state of implementation
> > completeness for the different runners.
> >
> > I had one follow-up question, regarding the cell at the intersection of the
> > Aggregators row and the Apache Flink column, with this content: "In
> > streaming mode, Aggregators may undercount”. Can you give me some ideas
> > about what this means? In what circumstances might this happen? Are there
> > some mitigation strategies that are appropriate?
> >
> > Thanks,
> >
> > Bill
>
>