Hi Amit,
Yes, this is correct. Part of the motivation is that the DoFn API is
user-facing, and the compressed representation of windowed elements (e.g.
access to all windows of an element), as well as the ability to emit
directly into a specified window, is an implementation detail of the run
You've got it right. My recommendation is to just directly implement it
for the Spark runner. It will often actually clean things up a bit. Here's
the analogous change for the Flink runner:
https://github.com/apache/incubator-beam/pull/1435/files.
With GABW, I tried going through the process of k
So basically, using the new DoFn with *SplittableParDo*, *Window.Bound*, and
*GroupAlsoByWindow* requires a custom implementation per runner, as they
are no longer handled by DoFn, right?
On Sun, Dec 11, 2016 at 3:42 PM Eugene Kirpichov
wrote:
Hi Amit, I'll comment in more detail later, but meanwhile please take a
look at https://github.com/apache/incubator-beam/pull/1565
There are a small number of relevant changes to the Spark runner.
Take a look at the implementation of SplittableParDo (already committed), in
particular ProcessFn and its usage
Hi all,
I've been working on migrating the Spark runner to the new DoFn and I've
stumbled upon a couple of cases where OldDoFn is used in a way that
accesses windowInternals (outputWindowedValue), such as AssignWindowsDoFn.
Since changing windows is no longer the responsibility of DoFn, I was
wondering