Yes, the model allows them. It also takes less work to avoid them in general (e.g. imagine one reshuffles N elements to M > N workers. A priori, one would "start" a bundle and then try to read all data destined for that worker--postponing this until one knows that the set of data for this worker could be an optimization (as could not doing so as a form of speculative execution) but should not be necessary.
- Robert On Mon, Oct 21, 2019 at 7:03 AM Jan Lukavský <je...@seznam.cz> wrote: > > Hi Max, > > that is true, but then we have two orthogonal issues: > > a) correctness - if empty bundles are aligned with the model, then > validates runner tests should take that into account > > b) performance - that can be dealt with in separate JIRA issue, if needed > > WDYT? > > Jan > > On 10/21/19 3:22 PM, Maximilian Michels wrote: > > Hi Jan, > > > > I think it is aligned with the model to create empty bundles. The > > question if course, whether it is preferable to avoid them, since the > > Setup/Finish state might be costly, depending on the bundle size and > > the type of DoFn used. > > > > Cheers, > > Max > > > > On 21.10.19 14:13, Kyle Weaver wrote: > >> Nevermind, this is discussed on the PR linked. > >> > >> On Mon, Oct 21, 2019 at 2:11 PM Kyle Weaver <kcwea...@google.com > >> <mailto:kcwea...@google.com>> wrote: > >> > >> Do you know why an empty bundle might be created? > >> > >> On Mon, Oct 21, 2019 at 1:42 PM Jan Lukavský <je...@seznam.cz > >> <mailto:je...@seznam.cz>> wrote: > >> > >> Hi, > >> > >> when debugging a flaky ParDoLifecycleTest in FlinkRunner, I have > >> found a > >> situation, where Flink might create empty bundle - i.e. call > >> @StartBundle immediately followed by @FinishBundle, with no > >> elements > >> inside the bundle. That is what breaks the ParDoLifecycleTest, > >> because > >> the test explicitly assumes, that the sequence of lifecycle > >> methods > >> should be StartBundle -> Process Element -> Finish Bundle. It is > >> easy to > >> modify the test to accept situation of StartBundle -> > >> FinishBundle with > >> no elements ([1]), but the question is, is this allowed by the > >> model? I > >> think there is no reason not to be, but I'd like to be sure. > >> > >> Thanks, > >> > >> Jan > >> > >> [1] https://github.com/apache/beam/pull/9841 > >>