Thanks Aljoscha and Max,

The fact that this only happens in failure scenarios is good to know, thanks.

Perhaps I'm unclear on what an “Aggregator” is. I assumed that a line such as 
the following:

        PCollection<KV<String, Double>> meanByName =
            dataPoints.apply(Mean.<String, Double>perKey());

…would be considered an Aggregator, since it applies a mean aggregation over a 
window. Is that correct, with respect to the Beam terminology? If not, what 
would an example of an Aggregator be?

Thanks,

Bill 

> On Mar 22, 2016, at 2:18 PM, Mark Shields <[email protected]> wrote:
> 
> Google's streaming implementation has the same property: counters are not 
> committed with work, so updates may sometimes be lost (i.e. undercounted) 
> or replayed (i.e. overcounted). It's a tradeoff between low-latency, cheap 
> monitoring and coherence with the underlying processing.
> 
> On Tue, Mar 22, 2016 at 1:57 AM, Aljoscha Krettek <[email protected]> wrote:
> Hi,
> In Flink, the accumulators/aggregators are not fault-tolerant. In case of a 
> failure, the job will be restarted but the accumulators will start from 
> scratch. Initially they were only meant as a rough way to gauge the progress 
> that a job is making. People should not rely on them for accurate numbers 
> right now.
> 
> Cheers,
> Aljoscha
> > On 21 Mar 2016, at 20:37, William McCarthy <[email protected]> wrote:
> >
> > Hi,
> >
> > I just had a look at the capability matrix here: 
> > http://beam.incubator.apache.org/capability-matrix/ 
> > I really like it, as it gives a nice summary of the current state of 
> > implementation completeness for the different runners.
> >
> > I had one follow-up question, regarding the cell at the intersection of the 
> > Aggregators row and the Apache Flink column, with this content: "In 
> > streaming mode, Aggregators may undercount". Can you give me some ideas 
> > about what this means? In what circumstances might this happen? Are there 
> > some mitigation strategies that are appropriate?
> >
> > Thanks,
> >
> > Bill
> 
> 
