[
https://issues.apache.org/jira/browse/TEZ-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13887590#comment-13887590
]
Bikas Saha commented on TEZ-678:
--------------------------------
Comments fixed in next patch except for the following.
bq. DAG.createAlias seems a little out of place. Why should the DAG provide a
static Alias creator? Could just let users instantiate AliasVertex / VertexGroup.
bq. DAG.addVertexGroup(..) would be a useful API to have (and avoid adding
vertices individually).
Vertices can be handled independently, while some operations may be performed
on them as part of a group. A vertex may also belong to multiple groups, so
adding member vertices via DAG.addVertexGroup() does not make sense in general.
DAG.createVertexGroup(Vertices) is not a static method. It is called on the DAG
object to create a group of its vertices.
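
To illustrate the point about group membership, here is a minimal sketch. The
classes SketchDag, SketchVertex and SketchVertexGroup are hypothetical toy
types written for this example, not the Tez classes, and the method signatures
are assumptions. Vertices are added individually, and groups are then created
through an instance method on the DAG over vertices it already owns, so the
same vertex can appear in more than one group.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy classes for illustration only; not the Tez API.
final class SketchVertex {
    final String name;
    SketchVertex(String name) { this.name = name; }
}

final class SketchVertexGroup {
    final String name;
    final List<SketchVertex> members;
    SketchVertexGroup(String name, List<SketchVertex> members) {
        this.name = name;
        this.members = members;
    }
}

final class SketchDag {
    private final Map<String, SketchVertex> vertices = new LinkedHashMap<>();
    private final List<SketchVertexGroup> groups = new ArrayList<>();

    // Vertices are always added one by one, independent of any group.
    SketchDag addVertex(SketchVertex v) {
        if (vertices.putIfAbsent(v.name, v) != null) {
            throw new IllegalStateException("Duplicate vertex: " + v.name);
        }
        return this;
    }

    // Instance (non-static) method: it groups vertices of this DAG only.
    SketchVertexGroup createVertexGroup(String groupName, SketchVertex... members) {
        for (SketchVertex m : members) {
            if (vertices.get(m.name) != m) {
                throw new IllegalArgumentException("Vertex not in this DAG: " + m.name);
            }
        }
        SketchVertexGroup group =
            new SketchVertexGroup(groupName, new ArrayList<>(Arrays.asList(members)));
        groups.add(group);
        return group;
    }

    int vertexCount() { return vertices.size(); }

    // Number of groups that contain the given vertex.
    long groupsContaining(SketchVertex v) {
        return groups.stream().filter(g -> g.members.contains(v)).count();
    }
}
```

In this sketch, creating two groups that share a vertex leaves the DAG with
each vertex registered exactly once, which is why adding vertices through the
group would be ambiguous.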
bq. The transition change - moving to TERMINATING - the DAG may end up staying
in this state and not moving to FAILED/KILLED.
This is following the same logic as an existing transition.
checkStateForCompletion() returns whether further completions are expected or
not, so based on its result we decide whether to wait in the TERMINATING state
or to go immediately to a final state.
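
A rough sketch of that decision, under stated assumptions: SketchDagState and
SketchDagTermination are hypothetical names invented for this example, and the
counter-based check only stands in for whatever checkStateForCompletion()
actually inspects. The idea shown is that the DAG waits in TERMINATING only
while completions are still outstanding, and each later completion re-checks
whether the final state can be entered.

```java
// Hypothetical types for illustration only; not the Tez state machine.
enum SketchDagState { RUNNING, TERMINATING, FAILED, KILLED }

final class SketchDagTermination {
    private final int totalVertices;
    private int completedVertices = 0;
    private SketchDagState state = SketchDagState.RUNNING;
    private SketchDagState pendingFinalState = null;

    SketchDagTermination(int totalVertices) {
        this.totalVertices = totalVertices;
    }

    // Stand-in for the completion check: are more completions expected?
    boolean moreCompletionsExpected() {
        return completedVertices < totalVertices;
    }

    // Called when the DAG must terminate with the given final state.
    SketchDagState onTerminate(SketchDagState finalState) {
        pendingFinalState = finalState;
        // Wait in TERMINATING only if something is still outstanding.
        state = moreCompletionsExpected() ? SketchDagState.TERMINATING : finalState;
        return state;
    }

    // Called for every vertex completion, before or after termination starts.
    SketchDagState onVertexCompleted() {
        completedVertices++;
        if (state == SketchDagState.TERMINATING && !moreCompletionsExpected()) {
            state = pendingFinalState; // last completion arrived, leave TERMINATING
        }
        return state;
    }

    SketchDagState state() { return state; }
}
```

In this sketch the DAG leaves TERMINATING as soon as the last expected
completion is reported, so it stays there only if a completion never arrives.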
bq. VertexState.COMMIT_FAILURE needs to be handled in VertexImpl - this will
likely put the DAG into an ERROR state.
I did not quite follow this. There is no such VertexState; COMMIT_FAILURE is a
termination reason used for diagnostics.
> Support for union operations
> ----------------------------
>
> Key: TEZ-678
> URL: https://issues.apache.org/jira/browse/TEZ-678
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Bikas Saha
> Assignee: Bikas Saha
> Attachments: TEZ-678.1.patch, TEZ-678.2.patch, TEZ-678.3.patch,
> TEZ-678.4.patch, TEZ-678.5.patch, TEZ-678.6.patch, TEZ-678.7.patch,
> TEZ-678.8.patch
>
>
> Unions represent a collection of results obtained from different branches of
> computation. The collection is a virtual operation that does not need to
> execute any tasks. Subsequent operations can conveniently work on the union
> named data set instead of each individual member of the union. While unions
> can be implemented efficiently without additional support from Tez, having
> API support can make it easier and less error-prone to implement.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)