[ https://issues.apache.org/jira/browse/PIG-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12892892#action_12892892 ]
Thejas M Nair commented on PIG-1461:
------------------------------------

The syntax "union ... using 'merge'" introduces a new use of the "using '...'" clause. So far this clause has only indicated the implementation algorithm and has had no impact on the semantics. If we are trying to avoid introducing another top-level operator, a keyword might be better instead, similar to the case of outer joins, e.g. "union onschema L1, L2;".

More suggestions/opinions on the syntax for this feature are welcome.

> support union operation that merges based on column names
> ---------------------------------------------------------
>
>                 Key: PIG-1461
>                 URL: https://issues.apache.org/jira/browse/PIG-1461
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>    Affects Versions: 0.8.0
>            Reporter: Thejas M Nair
>            Assignee: Thejas M Nair
>             Fix For: 0.8.0
>
>
> When the data has a schema, it often makes sense to union on the column names in
> the schema rather than on the position of the columns.
> The behavior of the existing union operator should remain backward compatible.
> This feature can be supported either with a new operator or by extending union
> to support a 'using' clause. I am thinking of having a new operator called
> either unionschema or merge. Does anybody have any other suggestions for the
> syntax?
> example -
> L1 = load 'x' as (a,b);
> L2 = load 'y' as (b,c);
> U = unionschema L1, L2;
> describe U;
> U: {a:bytearray, b:bytearray, c:bytearray}
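To make the proposed semantics concrete, here is a minimal Python sketch of a union that matches fields by column name instead of by position. It is not Pig's implementation. The helper names (`merge_schemas`, `union_by_name`), the sample rows, and the choice to fill absent columns with `None` are all assumptions made for illustration; the issue itself only specifies the resulting schema.

```python
from typing import List, Optional, Sequence, Tuple

Row = Tuple[Optional[object], ...]
Relation = Tuple[List[str], List[Row]]  # (column names, rows)


def merge_schemas(*schemas: Sequence[str]) -> List[str]:
    """Combine column names in order of first appearance, without duplicates."""
    merged: List[str] = []
    for schema in schemas:
        for name in schema:
            if name not in merged:
                merged.append(name)
    return merged


def union_by_name(*relations: Relation) -> Relation:
    """Union relations by column name (hypothetical helper, not a Pig API)."""
    merged = merge_schemas(*(schema for schema, _ in relations))
    out_rows: List[Row] = []
    for schema, rows in relations:
        # Map each column name to its position in this input's schema.
        position = {name: i for i, name in enumerate(schema)}
        for row in rows:
            # Assumption: a column missing from this input is filled with None.
            out_rows.append(
                tuple(row[position[name]] if name in position else None
                      for name in merged)
            )
    return merged, out_rows


# Hypothetical sample rows standing in for the contents of 'x' and 'y'.
L1: Relation = (["a", "b"], [(1, 2)])
L2: Relation = (["b", "c"], [(3, 4)])

schema, rows = union_by_name(L1, L2)
print(schema)  # ['a', 'b', 'c']
print(rows)    # [(1, 2, None), (None, 3, 4)]
```

The merged column list mirrors the `describe U` output in the example above: `b` appears once even though both inputs carry it, and each input row is placed by name rather than by position.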