Hi,

the join key is in the bag; that's the problem. The load function returns only one element, $0, and that is the map. In the next step this map is transformed into a bag with the UDF "MapToBagUDF". For example, the load function returns ([col1,col2,col3]), and the map inside the tuple is then transformed to:

(col1)
(col2)
(col3)

Maybe there is a way to transform the map into a bag directly in the load function? The problem I see is that the getNext() method of the LoadFunc has to return a Tuple, not a Bag. :/


2013/9/13 Pradeep Gollakota <[email protected]>

> Since your join key is not in the Bag, can you do your join first and then
> execute your UDF?
>
>
> On Fri, Sep 13, 2013 at 10:04 AM, John <[email protected]> wrote:
>
> > Okay, I think I have found the problem here:
> > http://pig.apache.org/docs/r0.11.1/perf.html#merge-joins ... there it is
> > written:
> >
> > There may be filter statements and foreach statements between the sorted
> > data source and the join statement. The foreach statement should meet the
> > following conditions:
> >
> > - There should be no UDFs in the foreach statement.
> > - The foreach statement should not change the position of the join
> >   keys.
> > - There should be no transformation on the join keys which will change
> >   the sort order.
> >
> >
> > I have to use a UDF to transform the Map into a Bag ... any idea for a
> > workaround?
> >
> > thanks
> >
> >
> > 2013/9/13 John <[email protected]>
> >
> > > Hi,
> > >
> > > I am trying to use a merge join for 2 bags. Here is my Pig code:
> > > http://pastebin.com/Y9b2UtNk .
> > >
> > > But I got this error:
> > >
> > > Caused by:
> > > org.apache.pig.backend.hadoop.executionengine.physicalLayer.LogicalToPhysicalTranslatorException:
> > > ERROR 1103: Merge join/Cogroup only supports Filter, Foreach, Ascending
> > > Sort, or Load as its predecessors. Found
> > >
> > > I think the reason is that there is no sort function or something like
> > > this. But the bags are definitely sorted. How can I do the merge join?
> > >
> > > thanks
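
For readers without access to the pastebin, the situation described in this thread can be sketched roughly as follows. This is only an illustration under assumptions, not the original script: the jar name, the loader name MyLoadFunc, the input paths, the relation aliases, and the use of FLATTEN are all hypothetical. Only MapToBagUDF and the merge join come from the thread.

```pig
-- Hypothetical jar containing the loader and the UDF.
REGISTER 'my-udfs.jar';

-- The loader returns one tuple per record; its only field ($0) is a map.
a_raw = LOAD 'input_a' USING MyLoadFunc() AS (m:map[]);

-- The join key only exists after the UDF has turned the map into a bag.
-- A UDF in the foreach is one of the things the merge-join docs rule out.
a = FOREACH a_raw GENERATE FLATTEN(MapToBagUDF(m)) AS key;

-- Second input, assumed to be sorted on the same key already.
b = LOAD 'input_b' AS (key:chararray);

-- The merge join that the thread reports as failing with ERROR 1103.
j = JOIN a BY key, b BY key USING 'merge';
```

Because the key is produced by the UDF, the foreach between the load and the join cannot satisfy the "no UDFs" condition quoted above, which is why doing the join before the UDF is not an option here.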
