I think what he wants is the transitive closure of the relation, which is not achievable in SQL-like languages alone, since they have only the expressive power of first-order logic. I suppose Pig Latin falls into this category.
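
To make the idea concrete, here is a minimal sketch in Python (not Pig) of grouping ids that are related directly or indirectly, using union-find. The (Id1, Id2) pairs are copied from the sample in the original question; the function name connected_groups is made up for illustration, and this assumes the whole pair list fits in memory on one machine.

```python
from collections import defaultdict


def connected_groups(pairs):
    """Group ids that are linked directly or through other ids (union-find)."""
    parent = {}

    def find(x):
        # Register unseen ids as their own root, then walk up to the root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    # Merge the groups of the two ids in every (Id1, Id2) row.
    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Collect the members of each group under its root.
    groups = defaultdict(set)
    for x in list(parent):
        groups[find(x)].add(x)
    return sorted(sorted(group) for group in groups.values())


# (Id1, Id2) columns from the sample table in the question.
pairs = [
    (770011, 990201),
    (990201, 770011),
    (770011, 770083),
    (770083, 770011),
    (990201, 770083),
    (770083, 990201),
]
print(connected_groups(pairs))  # [[770011, 770083, 990201]]
```

The loop over the pairs is the part that a single first-order query cannot express: the number of hops needed to connect two ids is not known in advance.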
--
Gianmarco

On Thu, Jun 10, 2010 at 19:54, hc busy <[email protected]> wrote:

> Is this like a tricky interview question? I don't see the pattern between
> those three numbers you listed and the sample of the table.
>
> 770011 770083 524 1e-120 89 12
> 770083 770011 494 1e-120 39 100
>
> ahh, I guess these are related because id1=id2 and id2=id1... Here's a first
> pass at the problem. Project:
>
> P1 = foreach table generate id1 as id1, id2 as id2, *;
> P2 = foreach table generate id2 as id1, id1 as id2, *;
> J = join P1 by (id1, id2), P2 by (id1,id2);
>
> and now J contains pairs of rows from original table where id1 and id2 are
> reversed.
>
> is this what you want?
>
> On Wed, Jun 9, 2010 at 6:54 PM, Renato Marroquín Mogrovejo <
> [email protected]> wrote:
>
>> Hi everyone, today I came across a particular query that I don't know
>> how to model in PIG. Part of my data looks like this:
>>
>> Id1       Id2       Sc    Va        P1    P2
>> --------- --------- ----- --------- ----- ----
>> 770011    990201    401   1e-125    100   65
>> 990201    770011    440   1e-125    100   42
>> 770011    770083    524   1e-120    89    12
>> 770083    770011    494   1e-120    39    100
>> 990201    770083    341   1e-125    73    41
>> 770083    990201    421   1e-125    90    85
>> .
>> .
>> .
>>
>> what I would like to retrieve is something like
>> this: 770011 990201 770083
>> because they are records actually related.
>> Any kind of ideas are highly appreciated. Thanks in advance.
>>
>> Renato M.
>>
>
