[ 
https://issues.apache.org/jira/browse/TEZ-678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880537#comment-13880537
 ] 

Siddharth Seth commented on TEZ-678:
------------------------------------

bq. AliasVertex extending Vertex is very convenient because it's modeling a 
virtual union vertex but with restrictions .....
It may be convenient, but I'm not sure it's correct - is 'AliasVertex' really a 
vertex, or just a convenience grouping of vertices which require the same 
operation? It doesn't actually form any part of the graph - the grouped 
vertices, however, are hooked into the graph. The only part of Vertex that is 
used is addOutputs - everything else is either new functionality (alias, ID) 
or is not required / supported.
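
To make the distinction concrete, here is a rough sketch of a grouping that is 
not itself a vertex. None of these classes are the Tez API or the code in the 
attached patches: Vertex, VertexGroup and their methods are hypothetical 
stand-ins, and forwarding addOutput to every member is an assumption made only 
to keep the example small.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class VertexGroupSketch {

    // Hypothetical stand-in for a DAG vertex; NOT the Tez Vertex class.
    public static final class Vertex {
        private final String name;
        private final List<String> outputs = new ArrayList<>();

        public Vertex(String name) {
            this.name = name;
        }

        public String name() {
            return name;
        }

        public void addOutput(String outputName) {
            outputs.add(outputName);
        }

        public List<String> outputs() {
            return Collections.unmodifiableList(outputs);
        }
    }

    // Hypothetical grouping. It deliberately does not extend Vertex: it is
    // never a node in the graph, it only names a set of member vertices.
    public static final class VertexGroup {
        private final String groupName;
        private final List<Vertex> members;

        public VertexGroup(String groupName, List<Vertex> members) {
            this.groupName = groupName;
            this.members = new ArrayList<>(members);
        }

        public String groupName() {
            return groupName;
        }

        public List<Vertex> members() {
            return Collections.unmodifiableList(members);
        }

        // Assumption for this sketch: the one operation shared with Vertex
        // is simply applied to every member of the group.
        public void addOutput(String outputName) {
            for (Vertex v : members) {
                v.addOutput(outputName);
            }
        }
    }

    public static void main(String[] args) {
        Vertex a = new Vertex("branchA");
        Vertex b = new Vertex("branchB");
        VertexGroup union = new VertexGroup("union", Arrays.asList(a, b));

        union.addOutput("sharedSink");

        // Both member vertices now carry the output; the group itself holds none.
        System.out.println(a.name() + " -> " + a.outputs());
        System.out.println(b.name() + " -> " + b.outputs());
    }
}
```

With this shape, the group exposes only what it supports, instead of 
inheriting Vertex methods that would have to be rejected at runtime.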

bq. A vertex can participate in multiple aliases. An alias can have multiple 
outputs, though I don't see any real use cases which would require that. Same for 
multiple edges .....
I believe Hive can make use of this for multi-inserts. In terms of ease of use 
- I'd definitely prefer creating a single group and hooking it up to multiple 
edges, rather than having to create the same group multiple times over. If a 
GroupedInput is added, it would only apply to edges which have a VertexGroup on 
them. Convenience aside, doesn't the Input descriptor really belong to the 
edge?
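
As a rough illustration of the multi-insert case, the sketch below creates one 
group and connects it to two destinations, with the grouped input descriptor 
carried by each edge rather than by the group. All names here (VertexGroup, 
GroupEdge, groupedInputClass, the vertex and class names) are hypothetical and 
are not taken from the Tez API or the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class GroupEdgeSketch {

    // Hypothetical named set of member vertices (identified by name only).
    public static final class VertexGroup {
        private final String groupName;
        private final List<String> memberNames;

        public VertexGroup(String groupName, List<String> memberNames) {
            this.groupName = groupName;
            this.memberNames = new ArrayList<>(memberNames);
        }

        public String groupName() {
            return groupName;
        }

        public List<String> memberNames() {
            return Collections.unmodifiableList(memberNames);
        }
    }

    // Hypothetical edge from a group to one destination vertex. The grouped
    // input descriptor lives here, so it exists only on edges that actually
    // have a group as their source.
    public static final class GroupEdge {
        private final VertexGroup source;
        private final String destination;
        private final String groupedInputClass;

        public GroupEdge(VertexGroup source, String destination, String groupedInputClass) {
            this.source = source;
            this.destination = destination;
            this.groupedInputClass = groupedInputClass;
        }

        public VertexGroup source() {
            return source;
        }

        public String destination() {
            return destination;
        }

        public String groupedInputClass() {
            return groupedInputClass;
        }
    }

    public static void main(String[] args) {
        // The group is created once...
        VertexGroup union = new VertexGroup("union", Arrays.asList("branchA", "branchB"));

        // ...and reused as the source of several edges, e.g. for a multi-insert.
        List<GroupEdge> edges = new ArrayList<>();
        edges.add(new GroupEdge(union, "insertTable1", "example.ConcatenatedInput"));
        edges.add(new GroupEdge(union, "insertTable2", "example.SortedMergedInput"));

        for (GroupEdge e : edges) {
            System.out.println(e.source().groupName() + " -> " + e.destination()
                    + " via " + e.groupedInputClass());
        }
    }
}
```

Keeping the descriptor on the edge lets the same group feed destinations that 
want the members' data combined in different ways.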

bq. Only a single output/committer is specified on the alias. So not sure what 
you mean by multiple committers being specified. A single commit is executed at 
runtime.
I must've read this incorrectly. The unit test seemed to be looking for 2 
invocations of commit.

Yep, we should get this in before 0.3.

> Support for union operations
> ----------------------------
>
>                 Key: TEZ-678
>                 URL: https://issues.apache.org/jira/browse/TEZ-678
>             Project: Apache Tez
>          Issue Type: Improvement
>            Reporter: Bikas Saha
>            Assignee: Bikas Saha
>         Attachments: TEZ-678.1.patch, TEZ-678.2.patch, TEZ-678.3.patch, 
> TEZ-678.4.patch, TEZ-678.5.patch
>
>
> Unions represent a collection of results obtained from different branches of 
> computation. The collection is a virtual operation that does not need to 
> execute any tasks. Subsequent operations can conveniently work on the union 
> named data set instead of each individual member of the union. While unions 
> can be implemented efficiently without additional support from Tez, having 
> API support can make it easier and less error-prone to implement.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
