Hi Kenneth,

Sure! I've created https://issues.apache.org/jira/browse/BEAM-3117 to track this.

Derek

On Thu, Oct 26, 2017 at 9:24 PM, Kenneth Knowles <k...@google.com> wrote:

> Hi Derek,
>
> I agree, that phrasing is simply incorrect and mightily confusing. Would
> you be up for filing a JIRA with some hyperlinks to the pages that say that?
>
> Kenn
>
> On Sun, Oct 22, 2017 at 9:54 AM, Derek Hao Hu <phoenixin...@gmail.com>
> wrote:
>
>> Thanks Kenneth! I sort of feel the notions of bundles and windows are a
>> bit confusing in Beam.
>>
>> For example, here is what the Beam Programming Guide says:
>>
>> "When performing an operation that groups elements in an unbounded
>> PCollection, Beam requires a concept called *windowing* to divide a
>> continuously updating data set into logical windows of finite size. Beam
>> processes each window as a bundle, and processing continues as the data set
>> is generated."
>>
>> So I would assume "bundles" and "windows" are terms that can be used
>> almost interchangeably.
>>
>> Do you know if there are any good posts or documentation about bundles?
>>
>> Cheers,
>>
>> Derek
>>
>> On Wed, Oct 18, 2017 at 6:59 AM, Kenneth Knowles <k...@google.com> wrote:
>>
>>> Bundles are decidedly not windows, so let's keep the two terms separate.
>>> It sounds like you are asking about bundles.
>>>
>>> The bundle size is a performance tuning parameter, chosen arbitrarily
>>> and dynamically by the runner. The runner chooses it on a best-effort
>>> basis to amortize @StartBundle/@FinishBundle operations across multiple
>>> @ProcessElement/@OnTimer calls. Your code must yield correct results for
>>> any bundling - you should be implementing per-element logic, where
>>> @StartBundle/@FinishBundle are implementation details.
>>>
>>> Kenn
>>>
>>> On Tue, Oct 17, 2017 at 5:37 PM, Derek Hao Hu <phoenixin...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Is there a more detailed explanation of how Beam chooses the window
>>>> size (bundle size) in streaming mode? It seems there is no clear answer in
>>>> the [Beam Programming Guide](https://beam.apache.org/documentation/programming-guide/)
>>>> and I can't find how PubsubIO implements this windowing strategy either. :(
>>>>
>>>> Could someone kindly provide some pointers? Thanks!
>>>> --
>>>> Derek Hao Hu
>>>>
>>>> Software Engineer | Snapchat
>>>> Snap Inc.
>>>>
>>>
>>>
>>
>>
>> --
>> Derek Hao Hu
>>
>> Software Engineer | Snapchat
>> Snap Inc.
>>
>
>

--
Derek Hao Hu

Software Engineer | Snapchat
Snap Inc.
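
For anyone finding this thread later, here is a rough sketch of Kenn's point that code must yield correct results for any bundling. It is plain Java and deliberately does not use the Beam SDK: `BundleSketch`, `BufferingDoubler`, and `runWithBundleSize` are hypothetical names made up for illustration, with methods that only mimic the roles of @StartBundle, @ProcessElement, and @FinishBundle. The loop that cuts the input into fixed-size bundles is likewise an assumption standing in for a runner, which in reality picks bundle sizes dynamically.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java stand-in for the bundle lifecycle. This is NOT the Beam API;
// every name in this file is hypothetical and exists only for illustration.
final class BundleSketch {

    /** Mimics a DoFn: per-element logic, with bundle hooks used only for buffering. */
    static final class BufferingDoubler {
        private final List<Integer> output; // stands in for the downstream collection
        private List<Integer> buffer;       // scratch state that lives for one bundle

        BufferingDoubler(List<Integer> output) {
            this.output = output;
        }

        // Plays the role of @StartBundle: set up per-bundle state.
        void startBundle() {
            buffer = new ArrayList<>();
        }

        // Plays the role of @ProcessElement: the result depends only on the element.
        void processElement(int element) {
            buffer.add(element * 2);
        }

        // Plays the role of @FinishBundle: flush whatever was buffered.
        void finishBundle() {
            output.addAll(buffer);
            buffer = null;
        }
    }

    /** Plays the role of a runner that happens to use a fixed bundle size. */
    static List<Integer> runWithBundleSize(List<Integer> input, int bundleSize) {
        if (bundleSize <= 0) {
            throw new IllegalArgumentException("bundleSize must be positive");
        }
        List<Integer> output = new ArrayList<>();
        BufferingDoubler fn = new BufferingDoubler(output);
        for (int start = 0; start < input.size(); start += bundleSize) {
            int end = Math.min(start + bundleSize, input.size());
            fn.startBundle();
            for (int element : input.subList(start, end)) {
                fn.processElement(element);
            }
            fn.finishBundle();
        }
        return output;
    }

    public static void main(String[] args) {
        List<Integer> input = Arrays.asList(1, 2, 3, 4, 5);
        // Three different bundlings of the same input.
        System.out.println(runWithBundleSize(input, 1));
        System.out.println(runWithBundleSize(input, 2));
        System.out.println(runWithBundleSize(input, 5));
    }
}
```

All three calls print the same list, because the buffering in the bundle hooks is purely an amortization detail and the per-element logic never depends on where a bundle starts or ends. Windowing is a separate concern that this sketch does not model at all.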