This should be baked in by default. I don't think people use less than 4g these days.

On Jun 6, 2012 12:24 PM, "Vinod Singh" <[email protected]> wrote:
> Child heap size can be increased by passing command line options as well.
> See the example given below-
>
> -Dmapred.map.child.java.opts=-Xmx6100m
> -Dmapred.reduce.child.java.opts=-Xmx6100m
>
> Thanks,
> Vinod
>
> http://blog.vinodsingh.com/
>
> On Wed, Jun 6, 2012 at 3:20 PM, Sean Owen <[email protected]> wrote:
>
> > You need to increase the size of the children's heap.
> > mapred.child.java.opts can be set to -Xmx4g for example. This is
> > usually put in mapred-site.xml.
> >
> > Sampling does decrease the size of the intermediate outputs; probably
> > not the final output so much. But this is not your problem. You are
> > running out of heap on the workers.
> >
> > You should definitely use more than one reducer! It's really up to
> > you, says Hadoop, to specify this, use -Dmapred.reduce.tasks=10 or
> > whatever is appropriate.
> >
> > The name of the jobs kind of says what they do, and the javadoc says a
> > little more. If you have specific questions I bet people can explain
> > here.
> >
> > Sean
> >
> >
> > On Wed, Jun 6, 2012 at 7:39 AM, Something Something
> > <[email protected]> wrote:
> > > Hello,
> > >
> > > I am running this job with a file containing 791,732,411 lines.
> > >
> > > Step 1 (PreparePreferenceMatrixJob-ItemIDIndexMapper-Reducer) completed
> > > in 3 minutes.
> > >
> > > Step 2 (PreparePreferenceMatrixJob-ToItemPrefsMapper-Reducer) took 2
> > > hours but completed successfully. It used only 1 Reducer so I am
> > > assuming the output is sorted, right?
> > >
> > > Step 3 (PreparePreferenceMatrixJob-ToItemVectorsMapper-Reducer) failed
> > > after running for 54 minutes with 'Error: Java heap space' error & it
> > > was all downhill from there.
> > >
> > >
> > > Question: Are there any configuration parameters I can use to cut down
> > > size of output? I noticed this in ToItemVectorsMapper:
> > >
> > > public static final String SAMPLE_SIZE = ToItemVectorsMapper.class +
> > > ".sampleSize";
> > >
> > > How do I cut down this sample size?
> > >
> > > Also, is there any documentation available that shows what each of
> > > these steps does? If not, I will just debug. Please let me know. Thanks.
> > >
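
For reference, the two settings discussed in this thread could be placed in mapred-site.xml roughly as follows. This is only a sketch: the values are the examples quoted above (-Xmx4g and 10 reducers), not tuned recommendations, and it assumes the cluster reads the classic mapred.* property names used in the thread. The same properties can also be passed per job with -D on the command line, as shown earlier.

```xml
<?xml version="1.0"?>
<!-- mapred-site.xml sketch; values are the examples from this thread, not tuned recommendations -->
<configuration>
  <!-- JVM options for each child (map/reduce) task; raises the task heap to 4 GB -->
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx4g</value>
  </property>
  <!-- Default number of reduce tasks per job; pick a value appropriate for the cluster -->
  <property>
    <name>mapred.reduce.tasks</name>
    <value>10</value>
  </property>
</configuration>
```

Each task slot on a worker may run its own JVM with this heap size, so the chosen -Xmx value has to fit within the memory actually available on each node.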
