You can't really get away from string comparisons:
B = FOREACH A GENERATE $0 AS v1;
C = FOREACH A GENERATE $0 AS v2;
D = CROSS B, C;
dump D; -- generates all permutations of the two key fields
E = foreach D generate (v1<v2?v1:v2) as v1, (v1<v2?v2:v1) as v2;
F = distinct E;
dump F; -- results in combinations

However, I think I can see a problem with this as well. If the rows of A are not all distinct, then you might need to generate a unique id for each row:

B = FOREACH A GENERATE $0 AS v1, sequential() as extra1;
C = FOREACH A GENERATE $0 AS v2, sequential() as extra2;
D = CROSS B, C;
D = filter D by extra1==extra2;
E = foreach D generate (v1<v2?v1:v2) as v1, (v1<v2?v2:v1) as v2;
F = distinct E;

This gives the actual results if you are solving the combinatorial problem of how many combinations and permutations there are of 5 "A"s, 6 "B"s and 7 "C"s.

On Sat, Jun 12, 2010 at 6:20 AM, Christian <[email protected]> wrote:

> Hello, this is my first contact with Pig and its community ;-)
>
> I need to generate all the possible permutations from a bag.
>
> Let me explain it with examples:
>
> A = LOAD 'data' AS f1:chararray;
>
> DUMP A;
> ('A')
> ('B')
> ('C')
>
> I can have all the possible combinations easily with CROSS:
>
> B = FOREACH A GENERATE $0 AS v1;
> C = FOREACH A GENERATE $0 AS v2;
>
> D = CROSS B, C;
> DUMP D;
> ('A', 'A')
> ('A', 'B')
> ('A', 'C')
> ('B', 'A')
> ('B', 'B')
> ('B', 'C')
> ('C', 'A')
> ('C', 'B')
> ('C', 'C')
>
> But what I need are the permutations. The result I want to obtain is
> something like:
>
> DUMP R;
> ('A', 'A')
> ('A', 'B')
> ('A', 'C')
> ('B', 'B')
> ('B', 'C')
> ('C', 'C')
>
> My first idea to solve that was to generate the CROSS and then FILTER like:
>
> R = FILTER D BY $0 < $1;
>
> It works but I would like to know if there is a better way to do this
> without having to use string comparison and assume that only one field is
> used. For example a real scenario would look like:
>
> DUMP A;
> ('A1', 'A2')
> ('B1', 'B2')
>
> DUMP R;
> ('A1', 'A2', 'A1', 'A2')
> ('A1', 'A2', 'B1', 'B2')
> ('B1', 'B2', 'B1', 'B2')
>
> Thank you in advance.
> Christian
>
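
If it helps to check the logic outside Pig, here is a minimal Python sketch of the CROSS / order / DISTINCT idea. It is only a model of the script, not Pig code: the function name unordered_pairs is made up for illustration, and it assumes each row can be compared as a whole, which is what lets the same trick cover the multi-field case.

```python
from itertools import product


def unordered_pairs(rows):
    """Model of the Pig steps: CROSS, put each pair in order, DISTINCT."""
    crossed = product(rows, rows)  # D = CROSS B, C
    # E: smaller value first, larger second (the two bincond expressions)
    ordered = ((min(a, b), max(a, b)) for a, b in crossed)
    # F = DISTINCT E (sorted only to make the printed output stable)
    return sorted(set(ordered))


# Single-field rows, as in the first example of the question.
single = unordered_pairs(["A", "B", "C"])
print(single)

# Two-field rows: Python compares tuples field by field, so whole rows
# can be ordered without writing a comparison per field.
multi = unordered_pairs([("A1", "A2"), ("B1", "B2")])
print(multi)
```

Note that the pairs with equal members such as ('A', 'A') survive here because the pair is reordered rather than filtered with a strict "<".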
