My first suggestion is we should focus on Stream API(or change the name we call it) for now. High-level API is confusion and could be anything that helps.
Stream is in fact more well-known concept other than Operators, ports, connector, etc. I think the idea originate from scala sequence API( http://www.scala-lang.org/api/current/#scala.collection.Seq) And the term "Stream" already implies some minimal function we need ex. "map"(t1->f->t1'), "reduce" (t1, t2,... -> f -> t1'), "filter" (t1 ->f(if true) -> t1) We shouldn't come up with arbitrary things so the API would become cumbersome and hard to learn. On Wed, Dec 23, 2015 at 3:12 PM, Siyuan Hua <[email protected]> wrote: > Another API that could be a reference is > http://storm.apache.org/documentation/Trident-API-Overview.html > > On Wed, Dec 23, 2015 at 3:09 PM, Ashwin Chandra Putta < > [email protected]> wrote: > >> // Made few edits, ignore previous mail. Read this instead. >> >> David, >> >> I can imagine that it boils down to something like these function calls. >> >> DtString lines = readLines(new LineReader()); >> DtString words = lines.split(new LineSplitter()); >> DtNumber count = words.count(new Counter()); >> count.print(new ConoleOutputOperator()); >> >> Or >> >> readLines(new LineReader()) >> .split(new LineSplitter()) >> .count(new Counter()) >> .print(new ConoleOutputOperator()); >> >> >> which translates to >> >> Reader reader = dag.addOperator("ReadLines", new LineReader()); >> Splitter splitter = dag.addOperator("Split", new LineSplitter()); >> Counter counter = dag.addOperator("Count", new WordCounter()); >> ConsoleOutputOperator console = dag.addOperator("Print", new >> ConsoleOutputOperator()); >> >> dag.addStream("lines", reader.output, splitter.input); >> dag.addStream("words", splitter.output, counter.input); >> dag.addStream("count", counter.output, console.input); >> >> Here are my initial thoughts: >> >> For the higher level api to work, we need the following support at least. >> >> 1. The operators used in the higher level api should have concrete >> implementations with all available input and output ports defined at the >> abstract level. Have to think more about how multiple output ports will >> play out. >> 2. We need to define the objects that have method calls available on them >> that take operators as parameters. >> >> Eg: DtString can have method split and takes Splitter operator. And >> Splitter operator should be abstract with input port type DtString and >> output port type DtString. LineSplitter will be a concrete implementation >> of this operator. >> >> Regards, >> Ashwin. >> >> On Wed, Dec 23, 2015 at 3:07 PM, Ashwin Chandra Putta < >> [email protected]> wrote: >> >> > David, >> > >> > I can imagine that it boils down to something like these function calls. >> > >> > DtString lines = readLines(new LineReader()); >> > DtString words = lines.split(new LineSplitter()); >> > DtNumber count = words.count(new Counter()); >> > count.print(new ConoleOutputOperator()); >> > >> > Or >> > >> > readLines(new LineReader()) >> > .split(new LineSplitter()) >> > .count(new Counter()) >> > .print(new ConoleOutputOperator()); >> > >> > >> > which translates to >> > >> > Reader reader = dag.addOperator("ReadLines", new LineReader()); >> > LineSplitter splitter = dag.addOperator("Split", new LineSplitter()); >> > WordCounter counter = dag.addOperator("Count", new Counter()); >> > ConsoleOutputOperator console = dag.addOperator("Print", new >> > ConsoleOutputOperator()); >> > >> > dag.addStream("lines", reader.output, splitter.input); >> > dag.addStream("words", splitter.output, counter.input); >> > dag.addStream("count", counter.output, console.input); >> > >> > Here are my initial thoughts: >> > >> > For the higher level api to work, we need the following support at >> least. >> > >> > 1. The operators used in the higher level api should have concrete >> > implementations with all available input and output ports defined at the >> > abstract level. Have to think more about how multiple output ports will >> > play out. >> > 2. We need to define the objects that have method calls available on >> them >> > that take operators as parameters. >> > >> > Eg: DtString can have method split and takes Splitter operator. And >> > Splitter operator should be abstract with input port type DtString and >> > output port type DtString. LineSplitter will be a concrete >> implementation >> > of this operator. >> > >> > Regards, >> > Ashwin. >> > >> > On Wed, Dec 23, 2015 at 1:42 PM, David Yan <[email protected]> >> wrote: >> > >> >> Hi fellow Apex developers: >> >> >> >> Apex has a comprehensive API for constructing DAG topologies for >> streaming >> >> applications, using operators, ports and streams. But this may seem >> too >> >> much for folks who just want to build simple applications, or just to >> >> learn >> >> about Apex. For example, when you compare the code to do word count in >> >> Apex with Spark Streaming or Flink, Apex requires much more code. >> >> >> >> Apex: >> >> >> >> >> https://github.com/apache/incubator-apex-malhar/tree/master/demos/wordcount/src/main >> >> >> >> Spark Streaming: >> >> >> >> >> https://github.com/apache/spark/blob/master/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java >> >> >> >> Flink: >> >> >> >> >> https://github.com/apache/flink/blob/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/wordcount/WordCount.java >> >> >> >> Note that their Scala versions are even simpler to use. >> >> >> >> The high-level requirements I have in mind is as follow: >> >> >> >> 1. A simple-to-use high-level API similar to what Spark Streaming and >> >> Flink >> >> have. And from the high-level API, the Apex engine will construct the >> >> actual DAG topology at launch time. >> >> >> >> 2. The first language we will support is Java, but we will also want to >> >> support Scala and possibly Python at some point, so the high-level API >> >> should make it easy for implementing bindings for at least these two >> >> languages. >> >> >> >> 3. We should be able to use the high-level API in Apex App Package >> (apa) >> >> file, so that dtcli can launch it just like a regular apa today. >> >> >> >> Please provide your ideas and thoughts on this topic. >> >> >> >> Thanks, >> >> >> >> David >> >> >> > >> > >> > >> > -- >> > >> > Regards, >> > Ashwin. >> > >> >> >> >> -- >> >> Regards, >> Ashwin. >> > >
