Hi Jason-

It seems like only full outer and full inner joins are supported. I was hoping to do just a left outer join.

Is this supported or planned?


The full inner/outer joins are examples, really. You can define your own operations by extending o.a.h.mapred.join.JoinRecordReader or o.a.h.mapred.join.MultiFilterRecordReader, then registering your new identifier with the parser by setting the property "mapred.join.define.<ident>" to your class.

For a left outer join, JoinRecordReader is the correct base. InnerJoinRecordReader and OuterJoinRecordReader should make its use clear.
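
As a rough illustration of the difference between the three join types, here is a minimal stand-alone sketch of the per-key accept/reject decision. It does not use the Hadoop classes: the boolean array is a stand-in for asking the joined tuple whether source i has a value for the current key, and the class and method names (JoinPredicates, inner, outer, leftOuter) are made up for this example. It also assumes the leftmost source in the join expression is source 0.

```java
// Stand-alone model of the accept/reject decision a join makes per key.
// present[i] is true when source i contributed a value for the current key.
// All names here are hypothetical; nothing in this file is Hadoop API.
final class JoinPredicates {

    private JoinPredicates() {}

    // Inner join: emit the key only if every source has it.
    static boolean inner(boolean[] present) {
        for (boolean p : present) {
            if (!p) {
                return false;
            }
        }
        return true;
    }

    // Full outer join: emit every key that reaches this check.
    static boolean outer(boolean[] present) {
        return true;
    }

    // Left outer join: emit the key only if the leftmost source has it;
    // the other sources may or may not be present.
    static boolean leftOuter(boolean[] present) {
        return present.length > 0 && present[0];
    }

    public static void main(String[] args) {
        boolean[] onlyLeft = {true, false};
        boolean[] onlyRight = {false, true};
        System.out.println("leftOuter(onlyLeft)  = " + leftOuter(onlyLeft));
        System.out.println("leftOuter(onlyRight) = " + leftOuter(onlyRight));
    }
}
```

In a real JoinRecordReader subclass, the leftOuter check would presumably go wherever InnerJoinRecordReader and OuterJoinRecordReader make their own decision, and the class would then be registered under an identifier of your choosing, e.g. "mapred.join.define.leftouter" (the identifier "leftouter" is only an example).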

On the flip side, doing the outer join is about 8x faster than doing a map/reduce over our dataset.

Cool! Out of curiosity, how are you managing your splits? -C
