Another API that could be a reference is http://storm.apache.org/documentation/Trident-API-Overview.html
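
To make the mapping in the proposal quoted below concrete, here is a minimal sketch of how a chained call could expand into one addOperator plus one addStream per step. Everything in it is hypothetical: FluentDagSketch, Dag, Pipe, from and then are illustration-only names, not part of Apex or Trident, and operators are represented by plain strings rather than real operator instances and ports.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these classes belong to the Apex or Trident APIs.
class FluentDagSketch {

    /** Stand-in for the low-level DAG; it only records what would be added. */
    static class Dag {
        final List<String> operators = new ArrayList<>();
        final List<String> streams = new ArrayList<>();

        void addOperator(String name) {
            operators.add(name);
        }

        void addStream(String name, String fromOperator, String toOperator) {
            streams.add(name + ": " + fromOperator + " -> " + toOperator);
        }
    }

    /** Fluent handle on the output of the most recently added operator. */
    static class Pipe {
        private final Dag dag;
        private final String upstream; // operator whose output this handle represents

        private Pipe(Dag dag, String upstream) {
            this.dag = dag;
            this.upstream = upstream;
        }

        /** Starts a pipeline by registering a source operator. */
        static Pipe from(Dag dag, String sourceName) {
            dag.addOperator(sourceName);
            return new Pipe(dag, sourceName);
        }

        /** Registers the next operator and a stream from the upstream operator to it. */
        Pipe then(String operatorName, String streamName) {
            dag.addOperator(operatorName);
            dag.addStream(streamName, upstream, operatorName);
            return new Pipe(dag, operatorName);
        }
    }

    /** Builds the word count topology using the operator and stream names from the proposal. */
    static Dag buildWordCount() {
        Dag dag = new Dag();
        Pipe.from(dag, "ReadLines")
            .then("Split", "lines")
            .then("Count", "words")
            .then("Print", "count");
        return dag;
    }

    public static void main(String[] args) {
        Dag dag = buildWordCount();
        dag.operators.forEach(System.out::println);
        dag.streams.forEach(System.out::println);
    }
}
```

The sketch assumes every operator has exactly one input and one output port, which is why a single upstream name is enough. Operators with multiple output ports would need a richer handle than this.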

On Wed, Dec 23, 2015 at 3:09 PM, Ashwin Chandra Putta wrote:

> // Made a few edits; ignore the previous mail and read this instead.
>
> David,
>
> I can imagine that it boils down to something like these function calls:
>
> DtString lines = readLines(new LineReader());
> DtString words = lines.split(new LineSplitter());
> DtNumber count = words.count(new Counter());
> count.print(new ConsoleOutputOperator());
>
> Or
>
> readLines(new LineReader())
>   .split(new LineSplitter())
>   .count(new Counter())
>   .print(new ConsoleOutputOperator());
>
> which translates to
>
> Reader reader = dag.addOperator("ReadLines", new LineReader());
> Splitter splitter = dag.addOperator("Split", new LineSplitter());
> Counter counter = dag.addOperator("Count", new WordCounter());
> ConsoleOutputOperator console = dag.addOperator("Print", new ConsoleOutputOperator());
>
> dag.addStream("lines", reader.output, splitter.input);
> dag.addStream("words", splitter.output, counter.input);
> dag.addStream("count", counter.output, console.input);
>
> Here are my initial thoughts:
>
> For the higher-level API to work, we need at least the following support.
>
> 1. The operators used in the higher-level API should have concrete
>    implementations, with all available input and output ports defined at
>    the abstract level. Have to think more about how multiple output ports
>    will play out.
> 2. We need to define the objects that have method calls available on them
>    that take operators as parameters.
>
> E.g.: DtString can have a method split that takes a Splitter operator. The
> Splitter operator should be abstract, with input port type DtString and
> output port type DtString. LineSplitter will be a concrete implementation
> of this operator.
>
> Regards,
> Ashwin.
>
> On Wed, Dec 23, 2015 at 3:07 PM, Ashwin Chandra Putta wrote:
>
> > David,
> >
> > I can imagine that it boils down to something like these function calls:
> >
> > DtString lines = readLines(new LineReader());
> > DtString words = lines.split(new LineSplitter());
> > DtNumber count = words.count(new Counter());
> > count.print(new ConsoleOutputOperator());
> >
> > Or
> >
> > readLines(new LineReader())
> >   .split(new LineSplitter())
> >   .count(new Counter())
> >   .print(new ConsoleOutputOperator());
> >
> > which translates to
> >
> > Reader reader = dag.addOperator("ReadLines", new LineReader());
> > LineSplitter splitter = dag.addOperator("Split", new LineSplitter());
> > WordCounter counter = dag.addOperator("Count", new Counter());
> > ConsoleOutputOperator console = dag.addOperator("Print", new ConsoleOutputOperator());
> >
> > dag.addStream("lines", reader.output, splitter.input);
> > dag.addStream("words", splitter.output, counter.input);
> > dag.addStream("count", counter.output, console.input);
> >
> > Here are my initial thoughts:
> >
> > For the higher-level API to work, we need at least the following support.
> >
> > 1. The operators used in the higher-level API should have concrete
> >    implementations, with all available input and output ports defined at
> >    the abstract level. Have to think more about how multiple output ports
> >    will play out.
> > 2. We need to define the objects that have method calls available on them
> >    that take operators as parameters.
> >
> > E.g.: DtString can have a method split that takes a Splitter operator. The
> > Splitter operator should be abstract, with input port type DtString and
> > output port type DtString. LineSplitter will be a concrete implementation
> > of this operator.
> >
> > Regards,
> > Ashwin.
> >
> > On Wed, Dec 23, 2015 at 1:42 PM, David Yan wrote:
> >
> >> Hi fellow Apex developers:
> >>
> >> Apex has a comprehensive API for constructing DAG topologies for
> >> streaming applications, using operators, ports and streams. But this may
> >> seem too much for folks who just want to build simple applications, or
> >> just want to learn about Apex. For example, when you compare the code to
> >> do word count in Apex with Spark Streaming or Flink, Apex requires much
> >> more code.
> >>
> >> Apex:
> >> https://github.com/apache/incubator-apex-malhar/tree/master/demos/wordcount/src/main
> >>
> >> Spark Streaming:
> >> https://github.com/apache/spark/blob/master/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java
> >>
> >> Flink:
> >> https://github.com/apache/flink/blob/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/wordcount/WordCount.java
> >>
> >> Note that their Scala versions are even simpler to use.
> >>
> >> The high-level requirements I have in mind are as follows:
> >>
> >> 1. A simple-to-use high-level API similar to what Spark Streaming and
> >>    Flink have. From the high-level API, the Apex engine will construct
> >>    the actual DAG topology at launch time.
> >>
> >> 2. The first language we will support is Java, but we will also want to
> >>    support Scala and possibly Python at some point, so the high-level API
> >>    should make it easy to implement bindings for at least these two
> >>    languages.
> >>
> >> 3. We should be able to use the high-level API in an Apex App Package
> >>    (apa) file, so that dtcli can launch it just like a regular apa today.
> >>
> >> Please provide your ideas and thoughts on this topic.
> >>
> >> Thanks,
> >>
> >> David
