I like the idea of one or more very simple examples of side inputs. There are
existing examples that use side inputs:

$ cd examples/java/src/main/java/org/apache/beam/examples
$ grep -r withSideInput .
./complete/TfIdf.java:                  .withSideInputs(totalDocuments));
./complete/game/GameStats.java:                  .withSideInputs(globalMeanScore));
./complete/game/GameStats.java:                  .withSideInputs(spammersView))
./cookbook/FilterExamples.java:                  .withSideInputs(globalMeanTemp));

From just this grep, it looks like all but one are broadcast scalar values.
I have not looked at them to see if they are too complex or too trivial.

Side inputs _are_ different in streaming as you have to pause the main
input or push back elements until a side input is ready for a window.
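
To make the push-back idea concrete, here is a rough plain-Java sketch. It
is not Beam code and not how any particular runner implements it; the class
and method names are made up for illustration. It only models the behavior
described above: main-input elements for a window are held until the side
input for that window is ready, then released.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of side-input push-back; names are illustrative only.
class SideInputPushback {
    // window id -> side input value, once it is ready
    private final Map<Integer, String> readySideInputs = new HashMap<>();
    // window id -> main-input elements held back while the side input is missing
    private final Map<Integer, List<String>> pushedBack = new HashMap<>();
    private final List<String> output = new ArrayList<>();

    /** A main-input element arrives for the given window. */
    void onElement(int window, String element) {
        String side = readySideInputs.get(window);
        if (side == null) {
            // Side input not ready for this window yet: hold the element.
            pushedBack.computeIfAbsent(window, w -> new ArrayList<>()).add(element);
        } else {
            output.add(element + "/" + side);
        }
    }

    /** The side input for a window becomes ready: release held elements. */
    void onSideInputReady(int window, String value) {
        readySideInputs.put(window, value);
        List<String> held = pushedBack.remove(window);
        if (held != null) {
            for (String element : held) {
                output.add(element + "/" + value);
            }
        }
    }

    List<String> output() {
        return output;
    }

    public static void main(String[] args) {
        SideInputPushback sim = new SideInputPushback();
        sim.onElement(0, "a");              // held: window 0 not ready
        sim.onElement(0, "b");              // held
        sim.onSideInputReady(0, "mean=3");  // releases a and b
        sim.onElement(0, "c");              // processed immediately
        sim.onElement(1, "d");              // held: window 1 not ready
        System.out.println(sim.output());   // prints [a/mean=3, b/mean=3, c/mean=3]
    }
}
```

In batch the side input is fully computed before the main input is
processed, so this holding step never becomes visible to the user.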

I would suggest multiple simple examples, each showing one way of using side
inputs. One particular thing to demonstrate might be a triggered
Combine.perKey(), with an explanation that it requires a View.asMultimap()
because triggers result in duplicate entries for a key.
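
The duplicate-key point can be illustrated with a plain-Java analogy. This
is not Beam code: the pane values and all names below are invented, and the
two collectors only stand in for a map-style view versus a multimap-style
view. The assumption is that a triggered per-key combine can emit more than
one pane for the same key, so a one-value-per-key structure breaks while a
multimap keeps every pane.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration; the "panes" are made-up data, not Beam output.
class TriggeredPanesAsView {
    /** Key "a" fires twice (an early pane, then a later one); key "b" fires once. */
    static List<Map.Entry<String, Integer>> panes() {
        return List.of(Map.entry("a", 3), Map.entry("b", 7), Map.entry("a", 10));
    }

    /** Map-style view: one value per key, so a duplicate key is an error. */
    static Map<String, Integer> asMap(List<Map.Entry<String, Integer>> kvs) {
        return kvs.stream()
            .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    /** Multimap-style view: every value for a key is kept. */
    static Map<String, List<Integer>> asMultimap(List<Map.Entry<String, Integer>> kvs) {
        return kvs.stream()
            .collect(Collectors.groupingBy(
                Map.Entry::getKey,
                Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        System.out.println(asMultimap(panes()).get("a")); // prints [3, 10]
        try {
            asMap(panes());
        } catch (IllegalStateException e) {
            // Collectors.toMap rejects the second entry for key "a".
            System.out.println("map view failed on duplicate key");
        }
    }
}
```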

Kenn

On Wed, Nov 21, 2018 at 4:40 PM Ruoyun Huang <[email protected]> wrote:

> Hi,
>
> I am working on sideInput support in the Java reference runner (ULR),
> JIRA-2928 [1]. Although there is an inline code snippet example [2] and
> there are unit tests [3], I did not find a good place showing a working
> example of SideInput (please correct me if I am wrong). I am thinking of
> creating one more WordCount example under the examples folder [4]. In
> particular, this example would show variants of a) side inputs as a scalar
> AND a multimap, b) side inputs from pipeline data or created within code,
> and c) [OPTIONAL?] streaming versus batch, if there are differences (this
> I am not sure of yet).
>
> In the meantime, JIRA-2928 can also easily rely on such an example to
> validate behaviors between portable/non-portable runners.
>
> I would like to double check whether this is a reasonable idea.
>
> Even though SideInput is just one of our many, many features, my
> justification is that it is commonly used, so having a one-stop example
> makes things easier for new users.  That being said, is there a reason not to
> have yet another WordCount example? (Another idea is to extend existing
> WordCount.java, but that breaks its simplicity.)
>
> If it is a good change to have, are there any suggestions on what else to include?
>
> Thanks!
>
> [1] https://issues.apache.org/jira/browse/BEAM-2928
> [2] sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/ParDo.java#L160
> [3] sdks/java/harness/src/test/java/org/apache/beam/fn/harness/state/MultimapSideInputTest.java
> [4] examples/java/src/main/java/org/apache/beam/examples
>
> --
> ================
> Ruoyun  Huang
>
>
