Hello, this is my first contact with Pig and its community ;-)
I need to generate all the possible permutations from a bag.
Let me explain it with examples:
A = LOAD 'data' AS (f1:chararray);
DUMP A;
('A')
('B')
('C')
I can have all the possible combinations easily with CROSS:
B = FOREACH A GENERATE $0 AS v1;
C = FOREACH A GENERATE $0 AS v2;
D = CROSS B, C;
DUMP D;
('A', 'A')
('A', 'B')
('A', 'C')
('B', 'A')
('B', 'B')
('B', 'C')
('C', 'A')
('C', 'B')
('C', 'C')
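But what I need are the permutations, or more precisely the unordered
pairs: ('B', 'A') should not appear because ('A', 'B') is already there.
The result I want to obtain is something like: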
DUMP R;
('A', 'A')
('A', 'B')
('A', 'C')
('B', 'B')
('B', 'C')
('C', 'C')
My first idea to solve that was to generate the CROSS and then FILTER, like:
R = FILTER D BY $0 <= $1;
It works, but I would like to know if there is a better way to do this
that does not rely on string comparison or assume that only one field is
used. For example, a real scenario would look like:
DUMP A;
('A1', 'A2')
('B1', 'B2')
DUMP R;
('A1', 'A2', 'A1', 'A2')
('A1', 'A2', 'B1', 'B2')
('B1', 'B2', 'B1', 'B2')
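To pin down the output I am after, here is a small Python sketch (not Pig, just a model of the desired result). It pairs rows by their position in the input rather than by comparing field values, so it works for any number of fields. The function name unordered_pairs is made up for this illustration, and I am assuming the rows fit in memory as a list of tuples.

```python
from itertools import combinations_with_replacement


def unordered_pairs(rows):
    """Return every unordered pair of rows (self-pairs included), flattened.

    combinations_with_replacement pairs elements by their position in the
    input, so field values are never compared and each row may have any
    number of fields.
    """
    return [left + right for left, right in combinations_with_replacement(rows, 2)]


# The two example inputs from this message, modelled as lists of tuples.
single = [("A",), ("B",), ("C",)]
double = [("A1", "A2"), ("B1", "B2")]

for row in unordered_pairs(double):
    print(row)
```

What I am looking for is the Pig equivalent of this: each pair of input tuples appears once, regardless of what the fields contain.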
Thank you in advance.
Christian