Actually, I wanted to do a left outer join, so I'm not sure whether the filter will work in this case.
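
For reference, here is a minimal sketch of how the outer-join case might look, assuming both relations have declared schemas (Pig needs them for outer joins). The LOAD paths and column types are hypothetical; only the relation and column names come from the thread. A plain FILTER on the timestamps would drop the unmatched rows of table_a, because a comparison against null is not true, so the sketch lets those rows through explicitly.

```pig
-- Hypothetical inputs: paths and types are assumptions, not from the thread.
table_a = LOAD 'table_a' AS (user_id:chararray, title_id:chararray, timestamp:long);
table_b = LOAD 'table_b' AS (user_id:chararray, title_id:chararray, timestamp:long);

-- Equi-join on both keys, keeping every row of table_a.
C = JOIN table_a BY (user_id, title_id) LEFT OUTER,
         table_b BY (user_id, title_id);

-- Keep rows with no match in table_b (its fields are null), plus
-- matched rows that satisfy the timestamp condition.
D = FILTER C BY (table_b::user_id IS NULL)
             OR (table_a::timestamp > table_b::timestamp);
```

Note that this is not identical to a SQL left join with the timestamp test in the ON clause: a table_a row that matches table_b on the keys but never passes the timestamp test is dropped here, whereas SQL would keep it with nulls for the table_b columns.
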
On Thu, Jul 5, 2012 at 12:43 PM, Alan Gates <[email protected]> wrote:

> Pig can only do equi-joins. Theta joins are hard in MapReduce. So the
> way to do this is to do the equi-join and then filter afterwards. This will
> not create significant additional cost since the join results will be
> filtered before being materialized to disk.
>
> C = JOIN table_a BY (user_id, title_id), table_b BY (user_id, title_id);
> D = FILTER C BY table_a::timestamp > table_b::timestamp;
>
> Alan.
>
> On Jul 5, 2012, at 12:21 PM, sonia gehlot wrote:
>
> > Hi Guys,
> >
> > I want to join 2 tables in hive on a couple of columns, with one more
> > condition: the timestamp of one column is greater than the other one.
> > In SQL I could have written it this way:
> >
> > table_a a Join table_b b
> > on a.user_id = b.user_id
> > and a.title_id = b.title_id
> > and a.timestamp > b.timestamp
> >
> > How to write the last condition in Pig? *a.timestamp > b.timestamp*
> >
> > Thanks,
> > Sonia
