Hello, this is my first contact with Pig and its community ;-)

I need to generate all the possible permutations from a bag.

Let me explain it with examples:

A = LOAD 'data' AS (f1:chararray);

DUMP A;
('A')
('B')
('C')

I can have all the possible combinations easily with CROSS:

B = FOREACH A GENERATE $0 AS v1;
C = FOREACH A GENERATE $0 AS v2;

D = CROSS B, C;
DUMP D;
('A', 'A')
('A', 'B')
('A', 'C')
('B', 'A')
('B', 'B')
('B', 'C')
('C', 'A')
('C', 'B')
('C', 'C')

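But what I need are the permutations in the sense of unordered pairs:
('B', 'A') should not appear once ('A', 'B') is there, while self-pairs
such as ('A', 'A') are kept. The result I want to obtain is something like: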

DUMP R;
('A', 'A')
('A', 'B')
('A', 'C')
('B', 'B')
('B', 'C')
('C', 'C')
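
To pin down the expected output independently of Pig, here is a minimal Python sketch of the same result. It is only an illustration of the semantics (what is usually called combinations with repetition), not a Pig solution; the function name unordered_pairs is made up for this example.

```python
from itertools import combinations_with_replacement


def unordered_pairs(items):
    """Return every pair (x, y) where y sits at or after x in the input order.

    Hypothetical helper for illustration: each element is paired with itself
    and with every later element, so ('B', 'A') never appears once
    ('A', 'B') has been produced.
    """
    return list(combinations_with_replacement(items, 2))


pairs = unordered_pairs(["A", "B", "C"])
for pair in pairs:
    print(pair)
```

For n input rows this yields n * (n + 1) / 2 pairs, compared with n * n for the full CROSS.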

My first idea to solve that was to generate the CROSS and then FILTER it, like:

R = FILTER D BY $0 <= $1;

It works, but I would like to know if there is a better way to do this
that does not rely on string comparison or assume that only one field is
used. For example, a real scenario would look like:

DUMP A;
('A1', 'A2')
('B1', 'B2')

DUMP R;
('A1', 'A2', 'A1', 'A2')
('A1', 'A2', 'B1', 'B2')
('B1', 'B2', 'B1', 'B2')
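
One way to avoid comparing the fields themselves is to compare row positions instead. The Python sketch below illustrates that idea under the assumption that each row can be given a unique, ordered id; the name pairs_by_position is hypothetical. In Pig, something like the RANK operator might supply such an id, but that is an assumption I have not tested.

```python
from itertools import product


def pairs_by_position(rows):
    """Pair each row with itself and with every later row.

    Rows are compared by their position (i <= j), never by their contents,
    so this works for any number of fields and any field types.
    """
    indexed = list(enumerate(rows))  # attach a position id to every row
    return [
        left + right  # concatenate the two tuples into one wide row
        for (i, left), (j, right) in product(indexed, indexed)
        if i <= j  # keep the diagonal and one side of it only
    ]


rows = [("A1", "A2"), ("B1", "B2")]
result = pairs_by_position(rows)
for row in result:
    print(row)
```

The filter still runs over the full cross product, so this only removes the dependency on string comparison; it does not reduce the amount of data generated.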

Thank you in advance.
Christian
