Hi all,

I used the following join in my project:

JOIN a1 BY xxx LEFT OUTER, a2 BY xxxx USING 'replicated'


After loading a large file into a2, I hit an out-of-memory error.

The Pig Latin documentation says that a replicated join loads the right-hand
side table into memory in each mapper, so the join can be computed without
reducers.
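If I understand the docs correctly, in a replicated join every relation after the first is the one held in memory, while the first is streamed, so the large input should come first and the small one second. A minimal sketch (file names and schemas are hypothetical, not from my script):

```pig
-- Hypothetical inputs: 'big_input' is large, 'small_input' must fit in
-- each mapper's heap.
big   = LOAD 'big_input'   AS (k:chararray, v1:chararray);
small = LOAD 'small_input' AS (k:chararray, v2:chararray);

-- 'big' is streamed; 'small' (the right-hand side) is replicated into
-- memory in every map task.
joined = JOIN big BY k LEFT OUTER, small BY k USING 'replicated';
```

In my case the large file ended up as a2, the in-memory side, which would explain the OOM.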

1. Can I resolve this by increasing the heap size? Is the relevant setting
mapred.child.java.opts?

2. Since the input file keeps growing, increasing the heap does not seem
like a long-term solution. Can I refactor that LEFT OUTER JOIN with other
operators? Does anyone have experience with this?
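One option I am considering (just a sketch, reusing the placeholder names from above): dropping USING 'replicated' so Pig falls back to a standard reduce-side join, which does not require either input to fit in memory.

```pig
-- Plain reduce-side join: no in-memory requirement, at the cost of a
-- shuffle/reduce phase.
joined = JOIN a1 BY xxx LEFT OUTER, a2 BY xxxx;

-- If a few keys dominate, a skewed join may help balance the reducers
-- (my assumption -- I have not tried it with LEFT OUTER):
joined_sk = JOIN a1 BY xxx LEFT OUTER, a2 BY xxxx USING 'skewed';
```

Would that be the recommended direction?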

3. Or any other suggestions?

Thanks @_@

-- 

                                               李响

E-mail             :[email protected]
