Hi Max,
that is true, but then we have two orthogonal issues:
a) correctness - if empty bundles are aligned with the model, then
ValidatesRunner tests should take that into account
b) performance - that can be dealt with in a separate JIRA issue, if needed
WDYT?
Jan
On 10/21/19 3:22 PM, Maximilian Michels wrote:
Hi Jan,
I think it is aligned with the model to create empty bundles. The
question is, of course, whether it is preferable to avoid them, since
the Setup/Finish state might be costly, depending on the bundle size
and the type of DoFn used.
Cheers,
Max
On 21.10.19 14:13, Kyle Weaver wrote:
Never mind, this is discussed on the linked PR.
On Mon, Oct 21, 2019 at 2:11 PM Kyle Weaver <kcwea...@google.com> wrote:
Do you know why an empty bundle might be created?
On Mon, Oct 21, 2019 at 1:42 PM Jan Lukavský <je...@seznam.cz> wrote:
Hi,
when debugging a flaky ParDoLifecycleTest in FlinkRunner, I have found
a situation where Flink might create an empty bundle, i.e. call
@StartBundle immediately followed by @FinishBundle, with no elements
inside the bundle. That is what breaks the ParDoLifecycleTest, because
the test explicitly assumes that the sequence of lifecycle methods
should be StartBundle -> ProcessElement -> FinishBundle. It is easy to
modify the test to accept the situation of StartBundle -> FinishBundle
with no elements ([1]), but the question is: is this allowed by the
model? I think there is no reason for it not to be, but I'd like to be
sure.
Thanks,
Jan
[1] https://github.com/apache/beam/pull/9841
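
For illustration, the relaxed check described above can be sketched without any Beam dependency. This is only a minimal Python sketch of the idea, not the actual ParDoLifecycleTest code: the function name `is_valid_bundle_sequence`, the string constants and the `allow_empty_bundles` flag are all hypothetical and exist only for this example.

```python
# Hypothetical labels for the bundle-level lifecycle calls discussed above.
START = "StartBundle"
PROCESS = "ProcessElement"
FINISH = "FinishBundle"


def is_valid_bundle_sequence(calls, allow_empty_bundles=True):
    """Return True if `calls` is a sequence of well-formed bundles.

    A bundle is StartBundle, zero or more ProcessElement calls, then
    FinishBundle. With allow_empty_bundles=False, a bundle with zero
    elements is rejected, which mirrors the stricter assumption the
    test made before being relaxed.
    """
    in_bundle = False
    elements = 0
    for call in calls:
        if call == START:
            if in_bundle:
                return False  # nested StartBundle
            in_bundle = True
            elements = 0
        elif call == PROCESS:
            if not in_bundle:
                return False  # element outside of any bundle
            elements += 1
        elif call == FINISH:
            if not in_bundle:
                return False  # FinishBundle without StartBundle
            if elements == 0 and not allow_empty_bundles:
                return False  # empty bundle rejected in strict mode
            in_bundle = False
        else:
            return False  # unknown call
    # Every started bundle must have been finished.
    return not in_bundle
```

With the default setting, both `[START, PROCESS, FINISH]` and `[START, FINISH]` are accepted; passing `allow_empty_bundles=False` rejects the second one, which corresponds to the behaviour that made the test flaky on Flink.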