That's a great question! Beam is about building an excellent programming model -- one that's unified for batch and streaming use cases, enables efficient execution, and is portable across multiple runtimes.
So Beam is neither the intersection of the functionality of all the engines (too limited!) nor the union (too much of a kitchen sink!). Instead, Beam tries to be at the forefront of where data processing is going, both pushing functionality into and pulling patterns out of the runtime engines. State [1] is a great example: functionality that existed in various engines and enabled interesting and common use cases, but wasn't originally expressible in Beam. We recently expanded the Beam model to include a version of this functionality according to Beam's design principles [2].

And vice versa, we hope that Beam will influence the roadmaps of the various engines as well. For example, the semantics of Flink's DataStreams were influenced [3] by the Beam (née Dataflow) model.

This also means that capabilities will not always be exactly the same across different Beam runners, which is why we use the capability matrix [4] to communicate the state of things clearly.

Hope that helps,
Frances

[1] https://beam.apache.org/blog/2017/02/13/stateful-processing.html
[2] https://beam.apache.org/contribute/design-principles/
[3] http://www.zdnet.com/article/going-with-the-stream-unbounded-data-processing-with-apache-flink/
[4] https://beam.apache.org/documentation/runners/capability-matrix/

On Tue, Feb 21, 2017 at 7:22 PM, Tang Jijun (Shanghai, Tech Dept, Data Platform) <[email protected]> wrote:

> I found a case: after submitting a Spark app, we can stop it or get its state through the JavaStreamingContext, but with the Beam API we can't stop or get the state of a pipeline. I think stop and getState methods should be added to PipelineRunner.
>
> *From:* James [mailto:[email protected]]
> *Sent:* February 22, 2017, 9:50
> *To:* [email protected]
> *Subject:* Is it possible that a feature which the underlying engine (e.g. Spark) supports can't be expressed using the Beam API?
>
> Is it possible that a feature which the underlying engine (e.g. Spark) supports can't be expressed using the Beam API? If there really is such a case, how should we handle it? (We are planning to use Beam as our data processing API, but have this concern.)
>
> Thanks in advance.
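For concreteness, the State support Frances mentions looks roughly like this in the Beam Java SDK. This is a minimal sketch in the spirit of the stateful-processing post [1], not a verbatim excerpt from it: the IndexFn name and the KV element types are invented for illustration, while @StateId, StateSpecs, and ValueState are the actual State API surface.

    import org.apache.beam.sdk.coders.VarIntCoder;
    import org.apache.beam.sdk.state.StateSpec;
    import org.apache.beam.sdk.state.StateSpecs;
    import org.apache.beam.sdk.state.ValueState;
    import org.apache.beam.sdk.transforms.DoFn;
    import org.apache.beam.sdk.values.KV;

    // Illustrative stateful DoFn: assigns an increasing index to each
    // element of a key, keeping a counter in per-key, per-window state.
    class IndexFn extends DoFn<KV<String, String>, KV<String, Integer>> {

      // Declare a state cell named "index" holding a single Integer.
      @StateId("index")
      private final StateSpec<ValueState<Integer>> indexSpec =
          StateSpecs.value(VarIntCoder.of());

      @ProcessElement
      public void processElement(
          ProcessContext c, @StateId("index") ValueState<Integer> index) {
        // read() returns null the first time a key is seen.
        Integer stored = index.read();
        int current = (stored == null) ? 0 : stored;
        c.output(KV.of(c.element().getKey(), current));
        index.write(current + 1);
      }
    }

Because the counter lives in keyed Beam state rather than in engine-specific APIs, each runner can back it with its own state implementation, which is the "pulling patterns out of the engines" point above.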
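As for the stop/getState request in the quoted mail: in the Beam Java SDK, the handle returned by Pipeline.run() is the intended hook for this kind of job control, via PipelineResult. A minimal sketch follows; the class name and the empty pipeline are placeholders, and whether getState() reports live status and cancel() actually stops the job depends on the runner (some runners may not support cancellation).

    import java.io.IOException;
    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.PipelineResult;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;

    public class RunAndControl {
      public static void main(String[] args) throws IOException {
        Pipeline pipeline =
            Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
        // ... build the pipeline here ...

        // run() submits the job and returns a handle instead of blocking.
        PipelineResult result = pipeline.run();

        // Poll the job state: RUNNING, DONE, FAILED, CANCELLED, ...
        PipelineResult.State state = result.getState();
        System.out.println("Pipeline state: " + state);

        // Ask the runner to stop the job; runners without cancellation
        // support may throw instead.
        if (state == PipelineResult.State.RUNNING) {
          result.cancel();
        }
      }
    }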
