Here is the link to join the discussion:
https://meet.google.com/idc-japs-hwf
Remember that it is this Friday Sept 14th from 11am-noon PST.



On Mon, Sep 10, 2018 at 7:30 AM Maximilian Michels <[email protected]> wrote:

> Thanks for moving forward with this, Lukasz!
>
> Unfortunately, I can't make it on Friday but I'll sync with somebody on
> the call (e.g. Ryan) about your discussion.
>
> On 08.09.18 02:00, Lukasz Cwik wrote:
> > Thanks to everyone who filled out the Doodle poll. The most
> > popular time was Friday Sept 14th from 11am-noon PST. I'll send out a
> > calendar invite and meeting link early next week.
> >
> > I have received a lot of feedback on the document and have addressed
> > some parts of it including:
> > * clarifying terminology
> > * processing skew due to some restrictions having their watermarks much
> > further behind than others, affecting scheduling of bundles by runners
> > * external throttling & I/O wait overhead reporting to make sure we
> > don't overscale
> >
> > Areas that still need additional feedback and details are:
> > * reporting progress around the work that is done and is active
> > * more examples
> > * unbounded restrictions being caused by an unbounded number of splits
> > of existing unbounded restrictions (infinite work growth)
> > * whether we should be reporting this information at the PTransform
> > level or at the bundle level
> >
> >
> >
> > On Wed, Sep 5, 2018 at 1:53 PM Lukasz Cwik <[email protected]
> > <mailto:[email protected]>> wrote:
> >
> >     Thanks to all those who have shown interest in this topic through
> >     the questions they have already asked on the doc, and to those
> >     interested in having this discussion. I have set up this Doodle to
> >     allow people to provide their availability:
> >     https://doodle.com/poll/nrw7w84255xnfwqy
> >
> >     I'll send out the chosen time based upon people's availability and a
> >     Hangout link by end of day Friday so please mark your availability
> >     using the link above.
> >
> >     The agenda of the meeting will be as follows:
> >     * Overview of the proposal
> >     * Enumerate and discuss/answer questions brought up in the meeting
> >
> >     Note that all questions and any discussions/answers provided will be
> >     added to the doc for those who are unable to attend.
> >
> >     On Fri, Aug 31, 2018 at 9:47 AM Jean-Baptiste Onofré
> >     <[email protected] <mailto:[email protected]>> wrote:
> >
> >         +1
> >
> >         Regards
> >         JB
> >         On Aug 31, 2018, at 18:22, Lukasz Cwik <[email protected]
> >         <mailto:[email protected]>> wrote:
> >
> >             That is possible, I'll take people's date/time suggestions
> >             and create a simple online poll with them.
> >
> >             On Fri, Aug 31, 2018 at 2:22 AM Robert Bradshaw
> >             <[email protected] <mailto:[email protected]>> wrote:
> >
> >                 Thanks for taking this up. I added some comments to the
> >                 doc. A European-friendly time for discussion would
> >                 be great.
> >
> >                 On Fri, Aug 31, 2018 at 3:14 AM Lukasz Cwik
> >                 <[email protected] <mailto:[email protected]>> wrote:
> >
> >                     I came up with a proposal[1] for a progress model
> >                     based solely on the backlog, in which splits are
> >                     based upon the remaining backlog we want the SDK to
> >                     split at. I also give recommendations to runner
> >                     authors as to how an autoscaling system could work
> >                     based upon the measured backlog. A lot of
> >                     discussions around progress reporting and splitting
> >                     in the past have been about finding an optimal
> >                     solution. After reading a lot of information about
> >                     work stealing, I don't believe there is a general
> >                     solution, and it really is up to SplittableDoFns to
> >                     be well behaved. I did not do much work in
> >                     classifying what a well-behaved SplittableDoFn is,
> >                     though. Much of this work builds off ideas that
> >                     Eugene had documented in the past[2].
> >
> >                     I could use the community's wide knowledge of
> >                     different I/Os to see if computing the backlog is
> >                     practical in the way that I'm suggesting and to
> >                     gather people's feedback.
> >
> >                     If there is a lot of interest, I would like to hold
> >                     a community video conference between Sept 10th and
> >                     14th about this topic. Please reply with your
> >                     availability by Sept 6th if you're interested.
> >
> >                     1: https://s.apache.org/beam-bundles-backlog-splitting
> >                     2: https://s.apache.org/beam-breaking-fusion
> >
> >                     On Mon, Aug 13, 2018 at 10:21 AM Jean-Baptiste
> >                     Onofré <[email protected] <mailto:[email protected]>> wrote:
> >
> >                         Awesome !
> >
> >                         Thanks Luke !
> >
> >                         I plan to work with you and others on this one.
> >
> >                         Regards
> >                         JB
> >                         On Aug 13, 2018, at 19:14, Lukasz Cwik
> >                         <[email protected] <mailto:[email protected]>>
> >                         wrote:
> >
> >                             I wanted to let you know that I will be
> >                             continuing from where Eugene left off with
> >                             SplittableDoFn. I know that many of you have
> >                             done a bunch of work with IOs and/or runner
> >                             integration for SplittableDoFn and would
> >                             appreciate your help in advancing this
> >                             awesome idea. If you have questions or
> >                             things you want to get reviewed related to
> >                             SplittableDoFn, feel free to send them my
> >                             way or include me on anything SplittableDoFn
> >                             related.
> >
> >                             I was part of several discussions with
> >                             Eugene and I think the biggest outstanding
> >                             design portion is to figure out how dynamic
> >                             work rebalancing would play out with the
> >                             portability APIs. This includes reporting of
> >                             progress from within a bundle. I know that
> >                             Eugene had shared some documents in this
> >                             regard but the position / split models
> >                             didn't work too cleanly in a unified sense
> >                             for bounded and unbounded SplittableDoFns.
> >                             It will likely take me a while to gather my
> >                             thoughts, but I could use your expertise as to
> >                             how compatible these ideas are with respect
> >                             to IOs and runners
> >                             Flink/Spark/Dataflow/Samza/Apex/... and
> >                             obviously help during implementation.
> >
>
