Hi Kenneth,

Sure! I've created https://issues.apache.org/jira/browse/BEAM-3117 to track
this.

Derek

On Thu, Oct 26, 2017 at 9:24 PM, Kenneth Knowles <k...@google.com> wrote:

> Hi Derek,
>
> I agree, that phrasing is simply incorrect and mightily confusing. Would
> you be up for filing a JIRA with some hyperlinks to the pages that say that?
>
> Kenn
>
> On Sun, Oct 22, 2017 at 9:54 AM, Derek Hao Hu <phoenixin...@gmail.com>
> wrote:
>
>> Thanks Kenneth! I sort of feel the notions of bundles and windows are a
>> bit confusing in Beam.
>>
>> For example, here is what the Beam Programming Guide says:
>>
>> "When performing an operation that groups elements in an unbounded
>> PCollection, Beam requires a concept called *windowing* to divide a
>> continuously updating data set into logical windows of finite size. Beam
>> processes each window as a bundle, and processing continues as the data set
>> is generated."
>>
>> So then I would assume "bundles" and "windows" are terms that can be used
>> almost interchangeably.
>>
>> Do you know if there are any good posts or documentation about bundles?
>>
>> Cheers,
>>
>> Derek
>>
>> On Wed, Oct 18, 2017 at 6:59 AM, Kenneth Knowles <k...@google.com> wrote:
>>
>>> Bundles are decidedly not windows, so let's keep the two terms separate.
>>> It sounds like you are asking about bundles.
>>>
>>> The bundle size is a performance tuning parameter, chosen arbitrarily
>>> and dynamically by the runner. The runner makes a best effort to
>>> amortize @StartBundle/@FinishBundle operations across multiple
>>> @ProcessElement/@OnTimer calls. Your code must yield correct results
>>> for any bundling - you should be implementing per-element logic, where
>>> @StartBundle/@FinishBundle are implementation details.
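
To illustrate the point about bundling, here is a minimal sketch in plain Python. It is not the Beam API: `BufferingWriter`, `run`, and the bundle sizes are hypothetical names made up for this example, and the methods only play the roles of @StartBundle, @ProcessElement, and @FinishBundle. It shows per-element logic producing the same output however the toy runner cuts the input into bundles.

```python
from typing import List, Sequence


class BufferingWriter:
    """Hypothetical stand-in for a DoFn; plain Python, not the Beam API."""

    def __init__(self) -> None:
        self.sink: List[str] = []      # pretend external sink
        self._buffer: List[str] = []   # per-bundle scratch space
        self.bundles_seen = 0

    def start_bundle(self) -> None:
        # Plays the role of @StartBundle: set up per-bundle state.
        self._buffer = []
        self.bundles_seen += 1

    def process(self, element: str) -> None:
        # Plays the role of @ProcessElement: the result depends only on
        # the element itself, never on which bundle it landed in.
        self._buffer.append(element.upper())

    def finish_bundle(self) -> None:
        # Plays the role of @FinishBundle: flush once per bundle, so the
        # flush cost is amortized over the bundle's elements.
        self.sink.extend(self._buffer)
        self._buffer = []


def run(fn: BufferingWriter, elements: Sequence[str],
        bundle_sizes: Sequence[int]) -> List[str]:
    """Toy runner (an assumption for this sketch): cuts the input into
    bundles, cycling through the given positive sizes."""
    pos = 0
    i = 0
    while pos < len(elements):
        size = bundle_sizes[i % len(bundle_sizes)]
        fn.start_bundle()
        for element in elements[pos:pos + size]:
            fn.process(element)
        fn.finish_bundle()
        pos += size
        i += 1
    return fn.sink


events = ["a", "b", "c", "d", "e"]
one_by_one = run(BufferingWriter(), events, [1])
uneven = run(BufferingWriter(), events, [2, 3])
all_at_once = run(BufferingWriter(), events, [len(events)])
```

All three runs leave the same contents in the sink; only the number of start/finish calls differs, which is the tuning knob the runner controls.
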
>>>
>>> Kenn
>>>
>>> On Tue, Oct 17, 2017 at 5:37 PM, Derek Hao Hu <phoenixin...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Is there a more detailed explanation of how Beam chooses the window
>>>> size (bundle size) in streaming mode? There seems to be no clear answer in
>>>> the [Beam Programming Guide](https://beam.apache.org/documentation/programming-guide/)
>>>> and I can't find how PubsubIO implements this windowing strategy
>>>> either. :(
>>>>
>>>> Could someone kindly provide some pointers? Thanks!
>>>> --
>>>> Derek Hao Hu
>>>>
>>>> Software Engineer | Snapchat
>>>> Snap Inc.
>>>>
>>>
>>>
>>
>>
>> --
>> Derek Hao Hu
>>
>> Software Engineer | Snapchat
>> Snap Inc.
>>
>
>


-- 
Derek Hao Hu

Software Engineer | Snapchat
Snap Inc.
