[
https://issues.apache.org/jira/browse/PIG-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jonathan Coveney updated PIG-3010:
----------------------------------
Attachment: PIG-3010-0.patch
Here is a patch that does this. The changes reach further than they
strictly need to, because this seemed like a good time to
future-proof flatten by switching to an enum approach.
A nice side effect is that you can implement FLATTEN as a UDF (though this
isn't necessarily desirable, as it adds some overhead; still, the
fact that it _can be done_ is quite powerful). That UDF is in
src/org/apache/pig/builtin/UdfFlatten.java
This lets you do a lot of really neat stuff, such as:
{code}
a = load 'data2' as (x:int,y:int);
b = foreach a generate UdfFlatten(x,y);
describe b;
{code}
which results in:
{code}
b: {x: int,y: int}
{code}
Woah! Previously, this was impossible. What happens if you dump? The result is:
{code}
(1,10)
(4,11)
(5,10)
{code}
Woah!
You can even do the following:
{code}
a = load 'data2' as (x:int,y:int);
b = foreach a generate UdfFlatten(TOTUPLE(x,y));
dump b;
{code}
And it works for bags as well. The uses are obvious IMHO.
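For the bag case, a rough, untested sketch might look like the following. It assumes UdfFlatten from the patch accepts a bag argument the same way it accepts a tuple, and it reuses the 'data2' file from above; TOBAG is the existing Pig builtin.
{code}
a = load 'data2' as (x:int,y:int);
-- TOBAG wraps its arguments in a bag: {(x),(y)}
b = foreach a generate UdfFlatten(TOBAG(x,y));
-- assumption: if this mirrors the built-in FLATTEN(TOBAG(x,y)), each input
-- row yields two single-field rows, one for x and one for y
dump b;
{code}
With the built-in FLATTEN, a bag produces one output row per element, so that is the behaviour to compare against.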
> Allow UDF's to flatten themselves
> ---------------------------------
>
> Key: PIG-3010
> URL: https://issues.apache.org/jira/browse/PIG-3010
> Project: Pig
> Issue Type: Improvement
> Reporter: Jonathan Coveney
> Assignee: Jonathan Coveney
> Fix For: 0.12
>
> Attachments: PIG-3010-0.patch
>
>
> This is something I've thought would be cool for a while, so I sat down and did
> it, because I think there are some useful debugging tools it would help with.
> The idea is that if you attach an annotation to a UDF, the Tuple or DataBag
> you output will be flattened. This is quite powerful. A very common pattern
> is:
> a = foreach data generate Flatten(MyUdf(thing)) as (a,b,c);
> This would let you just do:
> a = foreach data generate MyUdf(thing);
> With the exact same result!
--
This message is automatically generated by JIRA.