(I'm not familiar with the details of Catalyst itself.) The existing runners (Cloud Dataflow, Spark, Flink) all perform optimizations of their own, though it's quite likely there's a set of optimizations that are conceptually shared. For example, something like ParDo fusion is pretty fundamental to executing the Beam model efficiently. However, even that could be tuned very differently depending on the backend you are targeting. So I don't think we should have a single shared optimizer for all of Beam. However, if there's a set of graph transformations that are useful to multiple runners, it'd be great to have them written in a general way and put in some kind of runner util package.
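To make the ParDo fusion idea concrete, here is a minimal, hypothetical sketch (this is not Beam's actual API; the `Stage`/`fuse_pardos` names are made up for illustration). It treats a pipeline as a linear list of stages and collapses runs of adjacent element-wise ParDo stages into a single composed stage, stopping at a GroupByKey, which acts as a fusion barrier:

```python
# Hypothetical sketch of producer-consumer ParDo fusion as a generic,
# runner-independent graph transformation. Not Beam's real API.

from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Stage:
    kind: str                        # "ParDo" or "GroupByKey" (a fusion barrier)
    fn: Optional[Callable] = None    # element-wise function for ParDo stages

def fuse_pardos(stages: List[Stage]) -> List[Stage]:
    """Collapse each run of adjacent ParDo stages into one fused stage."""
    fused: List[Stage] = []
    for stage in stages:
        if stage.kind == "ParDo" and fused and fused[-1].kind == "ParDo":
            prev = fused.pop()
            # Compose the two element-wise functions so elements flow
            # through both without being materialized in between.
            fused.append(Stage("ParDo",
                               (lambda f, g: lambda x: g(f(x)))(prev.fn, stage.fn)))
        else:
            fused.append(stage)
    return fused

pipeline = [Stage("ParDo", lambda x: x + 1),
            Stage("ParDo", lambda x: x * 2),
            Stage("GroupByKey"),
            Stage("ParDo", lambda x: x - 3)]

optimized = fuse_pardos(pipeline)
# The first two ParDos fuse, so four stages become three.
```

A real runner would of course operate on a DAG rather than a list, and each backend might apply different cost heuristics when deciding what to fuse, which is exactly why the transformation is worth writing generically but tuning per runner.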
Frances

On Thu, Feb 18, 2016 at 6:37 PM, lonely Feb <[email protected]> wrote:
> Should we have a common optimization framework for Beam, much the same as
> Spark Catalyst? Optimization is so significant, but it seems that we have no
> plans for it?
