pig-user  

Re: Union support

Mridul Muralidharan
Mon, 29 Sep 2008 18:37:08 -0700


Hi Arthur,

  Does using store instead of dump work?
Something like:

-- start --
data = load '/Users/arthur/tmp/data' as (x, y);
data2 = load '/Users/arthur/tmp/data-2' as (x, y);
both = union data, data2;
store both into 'temp' using PigStorage();

cat temp
-- end --
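
If the stored output is also missing rows, it may help to check each input on its own before the union. This is only a sketch: it reuses the paths and aliases from the script above and assumes each relation can be dumped separately in Hadoop mode.

-- start --
data = load '/Users/arthur/tmp/data' as (x, y);
data2 = load '/Users/arthur/tmp/data-2' as (x, y);

-- check that each input loads correctly by itself
dump data;
dump data2;

-- then combine and store, as in the script above
both = union data, data2;
store both into 'temp' using PigStorage();
-- end --

With the sample files quoted below, the union should contain all five rows; union does not guarantee their order.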



Regards,
Mridul

Arthur Zwiegincew wrote:
I've come across a very basic problem—unions simply do not work in Hadoop
mode.

data files:

$ cat ~/tmp/data
1 1
2 1
3 10

$ cat ~/tmp/data-2
4 20
5 20

pig script:
data = load '/Users/arthur/tmp/data' as (x, y);
data2 = load '/Users/arthur/tmp/data-2' as (x, y);
both = union data, data2;
dump both;

result:
(4, 20)
(5, 20)


I've opened a bug <https://issues.apache.org/jira/browse/PIG-390> on this,
but there has been no response.


Am I missing anything?


Thanks,
Arthur