+1

Regards
JB

Le 31 août 2018 à 18:22, à 18:22, Lukasz Cwik <[email protected]> a écrit:
>That is possible, I'll take people's date/time suggestions and create a
>simple online poll with them.
>
>On Fri, Aug 31, 2018 at 2:22 AM Robert Bradshaw <[email protected]>
>wrote:
>
>> Thanks for taking this up. I added some comments to the doc. A
>> European-friendly time for discussion would be great.
>>
>> On Fri, Aug 31, 2018 at 3:14 AM Lukasz Cwik <[email protected]> wrote:
>>
>>> I came up with a proposal[1] for a progress model solely based off
>of the
>>> backlog and that splits should be based upon the remaining backlog
>we want
>>> the SDK to split at. I also give recommendations to runner authors
>as to
>>> how an autoscaling system could work based upon the measured
>backlog. A lot
>>> of discussions around progress reporting and splitting in the past
>has
>>> always been around finding an optimal solution, after reading a lot
>of
>>> information about work stealing, I don't believe there is a general
>>> solution and it really is upto SplittableDoFns to be well behaved. I
>did
>>> not do much work in classifying what a well behaved SplittableDoFn
>is
>>> though. Much of this work builds off ideas that Eugene had
>documented in
>>> the past[2].
>>>
>>> I could use the communities wide knowledge of different I/Os to see
>if
>>> computing the backlog is practical in the way that I'm suggesting
>and to
>>> gather people's feedback.
>>>
>>> If there is a lot of interest, I would like to hold a community
>video
>>> conference between Sept 10th and 14th about this topic. Please reply
>with
>>> your availability by Sept 6th if your interested.
>>>
>>> 1: https://s.apache.org/beam-bundles-backlog-splitting
>>> 2: https://s.apache.org/beam-breaking-fusion
>>>
>>> On Mon, Aug 13, 2018 at 10:21 AM Jean-Baptiste Onofré
><[email protected]>
>>> wrote:
>>>
>>>> Awesome !
>>>>
>>>> Thanks Luke !
>>>>
>>>> I plan to work with you and others on this one.
>>>>
>>>> Regards
>>>> JB
>>>> Le 13 août 2018, à 19:14, Lukasz Cwik <[email protected]> a écrit:
>>>>>
>>>>> I wanted to reach out that I will be continuing from where Eugene
>left
>>>>> off with SplittableDoFn. I know that many of you have done a bunch
>of work
>>>>> with IOs and/or runner integration for SplittableDoFn and would
>appreciate
>>>>> your help in advancing this awesome idea. If you have questions or
>things
>>>>> you want to get reviewed related to SplittableDoFn, feel free to
>send them
>>>>> my way or include me on anything SplittableDoFn related.
>>>>>
>>>>> I was part of several discussions with Eugene and I think the
>biggest
>>>>> outstanding design portion is to figure out how dynamic work
>rebalancing
>>>>> would play out with the portability APIs. This includes reporting
>of
>>>>> progress from within a bundle. I know that Eugene had shared some
>documents
>>>>> in this regard but the position / split models didn't work too
>cleanly in a
>>>>> unified sense for bounded and unbounded SplittableDoFns. It will
>likely
>>>>> take me awhile to gather my thoughts but could use your expertise
>as to how
>>>>> compatible these ideas are with respect to to IOs and runners
>>>>> Flink/Spark/Dataflow/Samza/Apex/... and obviously help during
>>>>> implementation.
>>>>>
>>>>

Reply via email to