Hi Beam Dev community,

I wanted to circle back on a recent Beam feature, Display Data, which we
proposed back in March [1] and which is now implemented in the Beam SDK.
Display Data provides a way for Runners to collect additional metadata about
a pipeline during construction, suitable for display in a UI.
PipelineOptions, PTransforms, and user-defined function types (DoFn,
CombineFn, WindowFn) register their display data, and the SDK provides hooks
for users to integrate display data from their own components.
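
To make the registration pattern concrete, here is a minimal, self-contained
sketch. It does not use the Beam classes: SketchBuilder, SketchHasDisplayData,
ReadFilesSketch, and DisplayDataSketch are hypothetical stand-ins that only
model the idea of a component registering key/value metadata at construction
time and a runner-side collector reading it back.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a display data builder; NOT the Beam SDK class.
final class SketchBuilder {
  private final Map<String, String> items = new LinkedHashMap<>();

  // Record one key/value pair; values are stored as display strings.
  SketchBuilder add(String key, Object value) {
    items.put(key, String.valueOf(value));
    return this;
  }

  // Return a read-only snapshot of everything registered so far.
  Map<String, String> build() {
    return Collections.unmodifiableMap(new LinkedHashMap<>(items));
  }
}

// Hypothetical stand-in for the hook a component implements.
interface SketchHasDisplayData {
  void populateDisplayData(SketchBuilder builder);
}

// Example user component (assumed for illustration) that registers its
// configuration as display data.
final class ReadFilesSketch implements SketchHasDisplayData {
  private final String filePattern;
  private final boolean validate;

  ReadFilesSketch(String filePattern, boolean validate) {
    this.filePattern = filePattern;
    this.validate = validate;
  }

  @Override
  public void populateDisplayData(SketchBuilder builder) {
    builder.add("filePattern", filePattern).add("validate", validate);
  }
}

final class DisplayDataSketch {
  // What a runner-side collector might do while translating a pipeline:
  // ask the component to populate a fresh builder, then read the result.
  static Map<String, String> collect(SketchHasDisplayData component) {
    SketchBuilder builder = new SketchBuilder();
    component.populateDisplayData(builder);
    return builder.build();
  }

  public static void main(String[] args) {
    System.out.println(collect(new ReadFilesSketch("/tmp/input-*.txt", true)));
  }
}
```

In the SDK itself, the corresponding hook is
HasDisplayData.populateDisplayData(DisplayData.Builder); the API docs [5] are
the authoritative reference for its exact signatures.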

Alex Amato and I wrote a blog post describing how Google Dataflow is now
surfacing display data in its monitoring interface [2]. I encourage other
Runner authors to take a look and consider how display data could fit into
their runners. Integrating display data is relatively straightforward, as
most of the heavy lifting is done in the SDK. The Dataflow Runner collects
display data during pipeline translation for PipelineOptions [3] and
PTransforms [4].

Please have a look at the display data API docs [5] and let me know if you
have any questions.

- Scott

[1]
http://mail-archives.apache.org/mod_mbox/incubator-beam-dev/201603.mbox/raw/%3CCAN-7FgbR%3DyXPHZj-GrPO3aGSkkj11NXwAoyOGEzWc9r3ApnOpg%40mail.gmail.com%3E/1
[2]
https://cloud.google.com/blog/big-data/2016/06/dataflow-updates-see-more-details-about-your-pipelines
[3]
https://github.com/apache/incubator-beam/blob/7d767056a90e769eff68d4347e1b3a7bc43f415c/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java#L406
[4]
https://github.com/apache/incubator-beam/blob/7d767056a90e769eff68d4347e1b3a7bc43f415c/runners/google-cloud-dataflow-java/src/main/java/org/apache/beam/runners/dataflow/DataflowPipelineTranslator.java#L548
[5]
http://beam.incubator.apache.org/javadoc/0.1.0-incubating/org/apache/beam/sdk/transforms/display/HasDisplayData.html#populateDisplayData-org.apache.beam.sdk.transforms.display.DisplayData.Builder-
