[
https://issues.apache.org/jira/browse/SAMZA-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15820737#comment-15820737
]
Yi Pan (Data Infrastructure) commented on SAMZA-1073:
-----------------------------------------------------
Added some early discussion materials for this top-level fluent API. So far,
the main points in the design doc are:
# introduce MessageStreamGraph as the representation of the operator DAG
# keep MessageStream as the programming API class that programmers use to
build the DAG
# introduce MessageStreamApplication as an abstract template class in which
users implement initGraph() to define the DAG
# introduce ExecutionEnvironment to carry out the execution of the
MessageStreamGraph (i.e. separate the physical deployment from the logical
description of the DAG)
Some user code examples are provided
[here|https://github.com/nickpan47/samza/blob/stream-graph-no-spec/samza-operator/src/main/java/org/apache/samza/operators/StreamOperatorAdaptorTask.java]
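To make the four points above concrete, here is a minimal, self-contained
sketch of how the pieces could fit together. It is not taken from the linked
code or the design doc: only the type names and initGraph() come from this
comment. Every method signature, the InMemoryExecutionEnvironment class, the
ErrorLengthApp example and the stream ids are assumptions made for
illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch: all signatures below are assumptions, not the Samza API.

/** Logical operator DAG: records operators, knows nothing about deployment. */
final class MessageStreamGraph {
    private final List<String> operators = new ArrayList<>();
    private Function<List<String>, ? extends List<?>> sink; // set by sendTo()

    MessageStream<String> createInStream(String streamId) {
        operators.add("input:" + streamId);
        return new MessageStream<String>(this, in -> in);
    }

    void addOperator(String description) { operators.add(description); }

    void setSink(Function<List<String>, ? extends List<?>> pipeline) { sink = pipeline; }

    List<String> operators() { return operators; }

    List<?> evaluate(List<String> input) {
        if (sink == null) throw new IllegalStateException("no output stream defined");
        return sink.apply(input);
    }
}

/** Fluent programming API: each call adds one operator to the graph. */
final class MessageStream<M> {
    private final MessageStreamGraph graph;
    private final Function<List<String>, List<M>> pipeline; // raw input -> this stream

    MessageStream(MessageStreamGraph graph, Function<List<String>, List<M>> pipeline) {
        this.graph = graph;
        this.pipeline = pipeline;
    }

    MessageStream<M> filter(Predicate<M> keep) {
        graph.addOperator("filter");
        return new MessageStream<M>(graph, in -> {
            List<M> out = new ArrayList<>();
            for (M m : pipeline.apply(in)) if (keep.test(m)) out.add(m);
            return out;
        });
    }

    <R> MessageStream<R> map(Function<M, R> fn) {
        graph.addOperator("map");
        return new MessageStream<R>(graph, in -> {
            List<R> out = new ArrayList<>();
            for (M m : pipeline.apply(in)) out.add(fn.apply(m));
            return out;
        });
    }

    void sendTo(String outputStreamId) {
        graph.addOperator("sendTo:" + outputStreamId);
        graph.setSink(pipeline);
    }
}

/** Abstract template: users implement initGraph() to define the DAG. */
abstract class MessageStreamApplication {
    abstract void initGraph(MessageStreamGraph graph);
}

/** Decides how the logical graph is executed physically. */
interface ExecutionEnvironment {
    List<?> run(MessageStreamApplication app, List<String> input);
}

/** Assumed stand-in for "single job for the whole DAG": runs it in memory. */
final class InMemoryExecutionEnvironment implements ExecutionEnvironment {
    @Override
    public List<?> run(MessageStreamApplication app, List<String> input) {
        MessageStreamGraph graph = new MessageStreamGraph();
        app.initGraph(graph);
        return graph.evaluate(input);
    }
}

/** Example user code: purely logical, no partitions or jobs mentioned. */
final class ErrorLengthApp extends MessageStreamApplication {
    @Override
    void initGraph(MessageStreamGraph graph) {
        graph.createInStream("log-lines")
             .filter(line -> line.startsWith("ERROR"))
             .map(String::length)
             .sendTo("error-lengths");
    }
}

final class FluentApiSketch {
    public static void main(String[] args) {
        List<String> input = Arrays.asList("ERROR disk full", "INFO ok", "ERROR oom");
        List<?> out = new InMemoryExecutionEnvironment().run(new ErrorLengthApp(), input);
        System.out.println(out); // prints [15, 9]
    }
}
```

The user-facing ErrorLengthApp only touches MessageStream and
MessageStreamGraph. Swapping InMemoryExecutionEnvironment for another
ExecutionEnvironment would change how the graph is deployed without changing
initGraph(), which is the separation the last point in the list describes.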
As for the scope of this JIRA, we will pursue stage 1 as described in
SAMZA-1041, i.e. a single job for the whole operator DAG. Multi-stage physical
jobs should be the responsibility of the ExecutionEnvironment and are not
included in the scope of this ticket.
> Adding through and StreamSpec to the fluent APIs operators
> ----------------------------------------------------------
>
> Key: SAMZA-1073
> URL: https://issues.apache.org/jira/browse/SAMZA-1073
> Project: Samza
> Issue Type: Sub-task
> Reporter: Yi Pan (Data Infrastructure)
> Assignee: Yi Pan (Data Infrastructure)
> Attachments: SAMZA-1073operator-multi-stagejob-levelprogrammingAPI.pdf
>
>
> It would be nice to allow users to stay at the logical level when using the
> fluent API's operators, without worrying about the physical partitions of
> the stream or the potential grouping of operators into one or more Samza
> jobs (SAMZA-1041).
> Hence, the fluent API needs to be able to express the physical topics as
> boundaries between stages in the single logical DAG.
> Besides, users should be able to use the fluent API to describe a logical
> expression at the top level, not within a job or within a task.