[ https://issues.apache.org/jira/browse/PIG-390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12627261#action_12627261 ]
Arthur Zwiegincew commented on PIG-390:
---------------------------------------
Here's a workaround I'm using:
package com.cooliris.analytics;

import java.io.IOException;

import org.apache.pig.EvalFunc;
import org.apache.pig.data.DataBag;
import org.apache.pig.data.Tuple;

/**
 * Implements a UNIONALL Pig function. It accepts a tuple of the format
 * <unused, {bag-1}, {bag-2}, {bag-3}, ...> and outputs a set of tuples
 * corresponding to UNION bag-1, bag-2, bag-3, ... This is intended as a
 * workaround to bug PIG-390 (Union doesn't work).
 *
 * Instead of:
 *     combined = UNION data1, data2, data3;
 * ...do the following:
 *     cg_combined = COGROUP data1 BY 1, data2 BY 1, data3 BY 1;
 *     combined = FOREACH cg_combined GENERATE FLATTEN(com.cooliris.analytics.UNIONALL(*));
 *
 * @author [EMAIL PROTECTED]
 */
public class UNIONALL extends EvalFunc<DataBag> {
    @Override
    public void exec(Tuple input, DataBag output) throws IOException {
        // Field 0 is the COGROUP key (unused); fields 1..n are the bags to merge.
        for (int i = 1; i < input.arity(); ++i) {
            // Copy every tuple of bag i into the output bag.
            for (Tuple nested : input.getBagField(i)) {
                output.add(nested);
            }
        }
    }
}
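
To make the behaviour of the loop above easier to check without a Pig installation, here is a minimal sketch of the same concatenation in plain Java. It is an illustration only, not Pig code: the class name UnionAllSketch and the method unionAll are hypothetical, a tuple is modelled as a List<Object>, and a bag as a list of such tuples. The sample rows are the ones from the issue description quoted below.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the UNIONALL logic, using plain collections
// instead of Pig's Tuple and DataBag types.
class UnionAllSketch {

    // Concatenates the bags held in fields 1..n of a cogrouped row.
    // Field 0 (the group key) is skipped, as in the UDF above.
    static List<List<Object>> unionAll(List<Object> cogroupedRow) {
        List<List<Object>> output = new ArrayList<>();
        for (int i = 1; i < cogroupedRow.size(); ++i) {
            @SuppressWarnings("unchecked")
            List<List<Object>> bag = (List<List<Object>>) cogroupedRow.get(i);
            output.addAll(bag);
        }
        return output;
    }

    public static void main(String[] args) {
        // Rows from ~/tmp/data and ~/tmp/data-2 in the issue description.
        List<List<Object>> data = Arrays.asList(
                Arrays.<Object>asList(1, 1),
                Arrays.<Object>asList(2, 1),
                Arrays.<Object>asList(3, 10));
        List<List<Object>> data2 = Arrays.asList(
                Arrays.<Object>asList(4, 20),
                Arrays.<Object>asList(5, 20));

        // COGROUP ... BY 1 puts every row under the single key 1,
        // giving one row of the form <key, {data}, {data2}>.
        List<Object> cogroupedRow = Arrays.<Object>asList(1, data, data2);

        // Prints [[1, 1], [2, 1], [3, 10], [4, 20], [5, 20]]
        System.out.println(unionAll(cogroupedRow));
    }
}
```

Grouping every input by the constant 1 is what places all rows in a single cogrouped tuple, so one call sees every bag; FLATTEN in the Pig script then turns the resulting bag back into individual rows.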
> Union doesn't work
> ------------------
>
> Key: PIG-390
> URL: https://issues.apache.org/jira/browse/PIG-390
> Project: Pig
> Issue Type: Bug
> Environment: Mac OS X
> Reporter: Arthur Zwiegincew
>
> data files:
> $ cat ~/tmp/data
> 1 1
> 2 1
> 3 10
> $ cat ~/tmp/data-2
> 4 20
> 5 20
> pig script:
> data = load '/Users/arthur/tmp/data' as (x, y);
> data2 = load '/Users/arthur/tmp/data-2' as (x, y);
> both = union data, data2;
> dump both;
> result:
> (4, 20)
> (5, 20)