Another API that could be a reference is
http://storm.apache.org/documentation/Trident-API-Overview.html

On Wed, Dec 23, 2015 at 3:09 PM, Ashwin Chandra Putta <
[email protected]> wrote:

> // Made few edits, ignore previous mail. Read this instead.
>
> David,
>
> I can imagine that it boils down to something like these function calls.
>
> DtString lines = readLines(new LineReader());
> DtString words = lines.split(new LineSplitter());
> DtNumber count  = words.count(new Counter());
> count.print(new ConoleOutputOperator());
>
> Or
>
> readLines(new LineReader())
>   .split(new LineSplitter())
>   .count(new Counter())
>   .print(new ConoleOutputOperator());
>
>
> which translates to
>
> Reader reader = dag.addOperator("ReadLines", new LineReader());
> Splitter splitter = dag.addOperator("Split", new LineSplitter());
> Counter counter = dag.addOperator("Count", new WordCounter());
> ConsoleOutputOperator console = dag.addOperator("Print", new
> ConsoleOutputOperator());
>
> dag.addStream("lines", reader.output, splitter.input);
> dag.addStream("words", splitter.output, counter.input);
> dag.addStream("count", counter.output, console.input);
>
> Here are my initial thoughts:
>
> For the higher level api to work, we need the following support at least.
>
> 1. The operators used in the higher level api should have concrete
> implementations with all available input and output ports defined at the
> abstract level. Have to think more about how multiple output ports will
> play out.
> 2. We need to define the objects that have method calls available on them
> that take operators as parameters.
>
> Eg: DtString can have method split and takes Splitter operator. And
> Splitter operator should be abstract with input port type DtString and
> output port type DtString. LineSplitter will be a concrete implementation
> of this operator.
>
> Regards,
> Ashwin.
>
> On Wed, Dec 23, 2015 at 3:07 PM, Ashwin Chandra Putta <
> [email protected]> wrote:
>
> > David,
> >
> > I can imagine that it boils down to something like these function calls.
> >
> > DtString lines = readLines(new LineReader());
> > DtString words = lines.split(new LineSplitter());
> > DtNumber count  = words.count(new Counter());
> > count.print(new ConoleOutputOperator());
> >
> > Or
> >
> > readLines(new LineReader())
> >   .split(new LineSplitter())
> >   .count(new Counter())
> >   .print(new ConoleOutputOperator());
> >
> >
> > which translates to
> >
> > Reader reader = dag.addOperator("ReadLines", new LineReader());
> > LineSplitter splitter = dag.addOperator("Split", new LineSplitter());
> > WordCounter counter = dag.addOperator("Count", new Counter());
> > ConsoleOutputOperator console = dag.addOperator("Print", new
> > ConsoleOutputOperator());
> >
> > dag.addStream("lines", reader.output, splitter.input);
> > dag.addStream("words", splitter.output, counter.input);
> > dag.addStream("count", counter.output, console.input);
> >
> > Here are my initial thoughts:
> >
> > For the higher level api to work, we need the following support at least.
> >
> > 1. The operators used in the higher level api should have concrete
> > implementations with all available input and output ports defined at the
> > abstract level. Have to think more about how multiple output ports will
> > play out.
> > 2. We need to define the objects that have method calls available on them
> > that take operators as parameters.
> >
> > Eg: DtString can have method split and takes Splitter operator. And
> > Splitter operator should be abstract with input port type DtString and
> > output port type DtString. LineSplitter will be a concrete implementation
> > of this operator.
> >
> > Regards,
> > Ashwin.
> >
> > On Wed, Dec 23, 2015 at 1:42 PM, David Yan <[email protected]>
> wrote:
> >
> >> Hi fellow Apex developers:
> >>
> >> Apex has a comprehensive API for constructing DAG topologies for
> streaming
> >> applications, using operators, ports and streams.  But this may seem too
> >> much for folks who just want to build simple applications, or just to
> >> learn
> >> about Apex.  For example, when you compare the code to do word count in
> >> Apex with Spark Streaming or Flink, Apex requires much more code.
> >>
> >> Apex:
> >>
> >>
> https://github.com/apache/incubator-apex-malhar/tree/master/demos/wordcount/src/main
> >>
> >> Spark Streaming:
> >>
> >>
> https://github.com/apache/spark/blob/master/examples/src/main/java/org/apache/spark/examples/streaming/JavaNetworkWordCount.java
> >>
> >> Flink:
> >>
> >>
> https://github.com/apache/flink/blob/master/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/wordcount/WordCount.java
> >>
> >> Note that their Scala versions are even simpler to use.
> >>
> >> The high-level requirements I have in mind is as follow:
> >>
> >> 1. A simple-to-use high-level API similar to what Spark Streaming and
> >> Flink
> >> have. And from the high-level API, the Apex engine will construct the
> >> actual DAG topology at launch time.
> >>
> >> 2. The first language we will support is Java, but we will also want to
> >> support Scala and possibly Python at some point, so the high-level API
> >> should make it easy for implementing bindings for at least these two
> >> languages.
> >>
> >> 3. We should be able to use the high-level API in Apex App Package (apa)
> >> file, so that dtcli can launch it just like a regular apa today.
> >>
> >> Please provide your ideas and thoughts on this topic.
> >>
> >> Thanks,
> >>
> >> David
> >>
> >
> >
> >
> > --
> >
> > Regards,
> > Ashwin.
> >
>
>
>
> --
>
> Regards,
> Ashwin.
>

Reply via email to