[ 
https://issues.apache.org/jira/browse/SAMZA-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15820737#comment-15820737
 ] 

Yi Pan (Data Infrastructure) commented on SAMZA-1073:
-----------------------------------------------------

Added some early discussion materials for this top-level fluent API. So far, 
the main points in the design doc are:
# introduce MessageStreamGraph as the representation of the operator DAG
# keep MessageStream as the programming API class that programmers use to build 
the DAG
# introduce the MessageStreamApplication class as an abstract template; users 
implement initGraph() to define the DAG
# introduce ExecutionEnvironment to carry out the execution of the 
MessageStreamGraph (i.e. separate the physical deployment from the logical 
description of the DAG)
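
To make the four points above concrete, here is a rough, self-contained sketch of how the pieces might fit together. Only the names MessageStreamGraph, MessageStream, MessageStreamApplication, initGraph() and ExecutionEnvironment come from the design doc summary; every signature, plus createInStream(), sendTo(), SingleJobEnvironment and PageViewApp, is a hypothetical stand-in and not the actual Samza API. The toy graph only records operator names; it does not process messages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

/** Hypothetical stand-in for the logical operator DAG: it only records operator names. */
class MessageStreamGraph {
    final List<String> operators = new ArrayList<>();

    // Assumed entry point for declaring an input stream (not a real Samza method).
    <M> MessageStream<M> createInStream(String name) {
        operators.add("input:" + name);
        return new MessageStream<>(this);
    }
}

/** Hypothetical fluent API class that programmers chain to build the DAG. */
class MessageStream<M> {
    private final MessageStreamGraph graph;

    MessageStream(MessageStreamGraph graph) {
        this.graph = graph;
    }

    // The predicate is accepted but not evaluated; only the operator is recorded.
    MessageStream<M> filter(Predicate<M> predicate) {
        graph.operators.add("filter");
        return this;
    }

    // The function is accepted but not applied; only the operator is recorded.
    <R> MessageStream<R> map(Function<M, R> fn) {
        graph.operators.add("map");
        return new MessageStream<>(graph);
    }

    // Assumed terminal operator naming an output stream.
    void sendTo(String name) {
        graph.operators.add("output:" + name);
    }
}

/** Abstract template: the user implements initGraph() to define the DAG. */
abstract class MessageStreamApplication {
    abstract void initGraph(MessageStreamGraph graph);
}

/** Separates physical deployment from the logical description of the DAG. */
interface ExecutionEnvironment {
    String run(MessageStreamApplication app);
}

/** Toy environment that treats the whole DAG as one job and returns its plan as text. */
class SingleJobEnvironment implements ExecutionEnvironment {
    @Override
    public String run(MessageStreamApplication app) {
        MessageStreamGraph graph = new MessageStreamGraph();
        app.initGraph(graph);
        return String.join(" -> ", graph.operators);
    }
}

/** Example user application (hypothetical stream names). */
class PageViewApp extends MessageStreamApplication {
    @Override
    void initGraph(MessageStreamGraph graph) {
        graph.<String>createInStream("page-views")
             .filter(view -> !view.isEmpty())
             .map(String::length)
             .sendTo("view-lengths");
    }
}

class FluentApiSketch {
    static String describe() {
        return new SingleJobEnvironment().run(new PageViewApp());
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

In this sketch the application never refers to jobs, tasks or partitions; swapping SingleJobEnvironment for another ExecutionEnvironment implementation would change the deployment without touching initGraph().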

Some user code examples are provided 
[here|https://github.com/nickpan47/samza/blob/stream-graph-no-spec/samza-operator/src/main/java/org/apache/samza/operators/StreamOperatorAdaptorTask.java]

As for the scope of this JIRA, we will pursue stage 1 as described in 
SAMZA-1041, i.e. a single job for the whole operator DAG. Multi-stage physical 
jobs should be the responsibility of the ExecutionEnvironment and are not 
included in the scope of this ticket.

> Adding through and StreamSpec to the fluent APIs operators
> ----------------------------------------------------------
>
>                 Key: SAMZA-1073
>                 URL: https://issues.apache.org/jira/browse/SAMZA-1073
>             Project: Samza
>          Issue Type: Sub-task
>            Reporter: Yi Pan (Data Infrastructure)
>            Assignee: Yi Pan (Data Infrastructure)
>         Attachments: SAMZA-1073operator-multi-stagejob-levelprogrammingAPI.pdf
>
>
> It would be nice to allow users to stay at the logical level when using the 
> fluent API's operators, without worrying about the physical partitions of the 
> stream or the potential grouping of operators into multiple / single Samza 
> jobs (SAMZA-1041).
> Hence, the fluent API needs to be able to express the physical topics as 
> boundaries between stages in the single logical DAG.
> Besides, users should be able to use the fluent API to describe a logical 
> expression at the top level, not within a job or within a task.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
