Nice. In terms of shared data structures, we have https://github.com/apache/beam/blob/master/sdks/common/runner-api/src/main/proto/beam_runner_api.proto . Presumably a utility that converts this to a dot file would be quite useful.
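As a rough illustration of what such a utility might look like, here is a minimal Python sketch. It does not read the actual proto: the plain-dict layout below (a unique name, sub-transforms, inputs and outputs per transform) is a simplified stand-in loosely modeled on it, and `to_dot`, `short_labels`, and the example ids are hypothetical names made up for this sketch.

```python
def to_dot(transforms, roots, short_labels=True):
    """Render a transform hierarchy as Graphviz DOT text.

    transforms: dict of id -> {"unique_name", "subtransforms", "inputs", "outputs"}
                (an assumed, simplified layout, not the real proto classes).
    roots: ids of the top-level transforms.
    Note: names are not escaped, so quotes inside names would break the output.
    """
    lines = ["digraph pipeline {"]
    producers = {}  # pcollection id -> id of the leaf transform that outputs it

    def label(tid, parent_name):
        name = transforms[tid]["unique_name"]
        prefix = parent_name + "/"
        # Optionally drop the prefix a node shares with its enclosing composite.
        if short_labels and parent_name and name.startswith(prefix):
            return name[len(prefix):]
        return name

    def emit(tid, parent_name, depth):
        t = transforms[tid]
        pad = "  " * depth
        text = label(tid, parent_name)
        if t.get("subtransforms"):
            # Composite transform: draw it as a cluster around its children.
            lines.append(f'{pad}subgraph "cluster_{tid}" {{')
            lines.append(f'{pad}  label="{text}";')
            for child in t["subtransforms"]:
                emit(child, t["unique_name"], depth + 1)
            lines.append(f"{pad}}}")
        else:
            # Leaf transform: draw it as a node and record what it produces.
            lines.append(f'{pad}"{tid}" [label="{text}"];')
            for pcoll in t.get("outputs", {}).values():
                producers[pcoll] = tid

    for root in roots:
        emit(root, "", 1)

    # Connect each leaf transform to the leaf that produced its inputs.
    for tid, t in transforms.items():
        if t.get("subtransforms"):
            continue
        for pcoll in t.get("inputs", {}).values():
            if pcoll in producers:
                lines.append(f'  "{producers[pcoll]}" -> "{tid}";')

    lines.append("}")
    return "\n".join(lines)


# Hypothetical pipeline: Read, followed by a composite Count with two steps.
example = {
    "t0": {"unique_name": "Read", "outputs": {"out": "pc0"}},
    "t1": {"unique_name": "Count", "subtransforms": ["t2", "t3"],
           "inputs": {"in": "pc0"}, "outputs": {"out": "pc2"}},
    "t2": {"unique_name": "Count/Pair",
           "inputs": {"in": "pc0"}, "outputs": {"out": "pc1"}},
    "t3": {"unique_name": "Count/Sum",
           "inputs": {"in": "pc1"}, "outputs": {"out": "pc2"}},
}

print(to_dot(example, ["t0", "t1"]))
```

The printed text can be saved to a file and rendered with Graphviz, e.g. `dot -Tpng pipeline.dot -o pipeline.png`. Passing `short_labels=False` keeps the full names, which makes it easy to compare the two labelling styles on the same pipeline.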
It might be interesting to experiment with different ways of handling the nesting. For example, the nodes inside a composite transform need not repeat their common prefix, which could make things more compact.

On Thu, Aug 3, 2017 at 9:25 PM, Ahmet Altay <[email protected]> wrote:

> +1, this looks great and it will be very useful for users to understand
> their pipelines.
>
> On Thu, Aug 3, 2017 at 8:25 PM, Pei HE <[email protected]> wrote:
>
>> Hi all,
>> While working on the JStorm and MapReduce runners, I found that
>> visualizing Beam pipelines is very helpful for understanding them.
>>
>> Logical graph:
>> https://drive.google.com/file/d/0B6iZ7iRh-LOYc0dUS0Rwb2tvWGM/view?usp=sharing
>>
>> Physical graph:
>> https://drive.google.com/file/d/0B6iZ7iRh-LOYbDFWeDlCcDhnQmc/view?usp=sharing
>>
>> I think we can visualize the Beam logical DAG in runner-core. It should
>> also be easy to visualize the physical DAG in each runner. (Maybe we can
>> define some shared data structures to make it more automatic, and even
>> support visualizing them in the Apex/Flink/Spark/Gearpump UIs.)
>>
>> I have a commit for the MapReduce runner here (<200 lines). This commit
>> generates dot files for the logical and physical DAGs.
>>
>> https://github.com/peihe/incubator-beam/commit/bb3349e10c0cfacd81b610880ddfec030fedf34d
>>
>> Looking forward to ideas and feedback.
>> --
>> Pei
>>
