[ https://issues.apache.org/jira/browse/PIG-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555391#comment-13555391 ]
Jonathan Coveney commented on PIG-3083:
---------------------------------------
Arnab,
This is good stuff.
I would definitely recommend writing at least some script-level tests that use
the new feature sooner rather than later. Especially when dealing with parser
and syntax changes, such tests serve as a good reference for the scope of the
changes you want to make.
As for 1, I think we want it to go even further than that: if you end up with
a::b::c::... and so on, we should allow users to use any of the nested relation
names that they want.
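To make the nested-prefix idea concrete, here is a rough Python sketch of one possible reading: a column is selected by `rel::*` if `rel` appears anywhere in its :: chain. This is not Pig's actual alias resolution; `columns_for_prefix` and the sample aliases are hypothetical and exist only for illustration.

```python
def columns_for_prefix(aliases, relation):
    """Return the aliases whose :: chain names `relation` at any nesting level.

    Hypothetical helper: it only illustrates the matching rule, it is not
    how Pig resolves aliases internally.
    """
    selected = []
    for alias in aliases:
        # Everything before the last :: is a relation name; the last part is the column.
        *relations, _column = alias.split("::")
        if relation in relations:
            selected.append(alias)
    return selected


# Assumed schema after two nested joins (made up for this example).
aliases = ["a::b::x", "a::b::y", "a::c::x", "d::z", "w"]

print(columns_for_prefix(aliases, "a"))  # ['a::b::x', 'a::b::y', 'a::c::x']
print(columns_for_prefix(aliases, "b"))  # ['a::b::x', 'a::b::y']
```

Under this reading both a::* and b::* pick up a::b::x, which is the "any of the nested relation names" behaviour described above. Whether a partial chain such as a::b::* should also be accepted is left open here.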
On a higher-level note, I am curious about your thoughts on implementing it the
way you did. It's an interesting tack to make relation::* into a relational
operator. Another option is to expand relation::* at parse time into
relation::col1, relation::col2, relation::col3. This would circumvent the
issues around column pruning. I am open to different implementations, and we
can definitely work through any optimizer issues; I am just curious about the
benefits of this approach, since an expansion in the parser seems much simpler
to me (though I suppose you could argue that it makes the parser more
complicated, which is undesirable).
> Introduce new syntax that lets you project just the columns that come from a
> given :: prefix
> ---------------------------------------------------------------------------------------------
>
> Key: PIG-3083
> URL: https://issues.apache.org/jira/browse/PIG-3083
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.12
> Reporter: Jonathan Coveney
> Labels: PIG-3078
> Fix For: 0.12
>
> Attachments: pig_jira_aguin_3083.patch
>
>
> This is basically a more refined approach than PIG-3078, but it is also more
> work. That JIRA is more of a stopgap until we do something like this.
> The idea would be to support something like the following:
> a = load 'a' as (x,y,z);
> b = load 'b' as (x,y,z);
> c = join a by x, b by x;
> d = foreach c generate a::*;
> Obviously this is useful for any case where you have relations with columns
> with various prefixes.