While running a pipeline on Dataflow I noticed that the 'display data' of ParquetIO was not showing up in the Dataflow UI. After digging deeper, I found that display data of composite transforms is not shown on Dataflow:
BEAM-366 Support Display Data on Composite Transforms
https://issues.apache.org/jira/browse/BEAM-366

I also noticed that for primitive transforms, what is shown is not the populateDisplayData code extended from PTransform, but the populateDisplayData method implemented at the parameterizing-function level, concretely the DoFn or Source in the case of IOs. This surprised me because we have been implementing all these methods in the wrong place (at the PTransform level) for years and ignoring the function, so they are not shown in the UI. So I was wondering:

1. Does Google plan to support displaying composite transforms (BEAM-366) at some point?

2. If (1) is not happening soon, should we move all our populateDisplayData implementations to the function level only (DoFn, Source, WindowFn)?

Since the open source runners (Flink, Spark, etc.) do not use DisplayData at all, I suppose we should keep this discussion at the Dataflow level only for now.

I do not know how this is modeled in Portable Pipelines: is DisplayData part of FunctionSpec to support the current use case? I saw that DisplayData is considered at the PTransform level, so this should cover the composite case, but I am curious whether Portable Pipelines correctly cover the parameterized-function level that is currently in use. A sketch of the two placements I am referring to is below.
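
To illustrate the two placements, here is a minimal sketch (the MyWrite / WriteFn names and the "filePattern" item are hypothetical, just loosely mirroring the shape of an IO like ParquetIO; it is not the actual ParquetIO code):

import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.PTransform;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.transforms.display.DisplayData;
import org.apache.beam.sdk.values.PCollection;

// Hypothetical composite write transform.
public class MyWrite extends PTransform<PCollection<String>, PCollection<String>> {

  private final String path;

  public MyWrite(String path) {
    this.path = path;
  }

  @Override
  public PCollection<String> expand(PCollection<String> input) {
    return input.apply("WriteFn", ParDo.of(new WriteFn(path)));
  }

  // Display data declared on the composite PTransform: this is where most IOs
  // implement it today, but it is what I do not see surfaced in the Dataflow UI.
  @Override
  public void populateDisplayData(DisplayData.Builder builder) {
    super.populateDisplayData(builder);
    builder.add(DisplayData.item("filePattern", path).withLabel("File pattern"));
  }

  private static class WriteFn extends DoFn<String, String> {
    private final String path;

    WriteFn(String path) {
      this.path = path;
    }

    @ProcessElement
    public void processElement(ProcessContext c) {
      c.output(c.element());
    }

    // Display data declared on the DoFn: this is what actually shows up in the
    // Dataflow UI for the primitive ParDo step.
    @Override
    public void populateDisplayData(DisplayData.Builder builder) {
      super.populateDisplayData(builder);
      builder.add(DisplayData.item("filePattern", path).withLabel("File pattern"));
    }
  }
}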
