I actually solved the problem by increasing the child task JVM heap via a
parameter in hadoop-site.xml, since the default wasn't sufficient:

<property>
  <name>mapred.child.java.opts</name>
  <value>-Xmx1024m</value>
</property>
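
For reference, the same property can also be set per-job rather than
cluster-wide. Here's a minimal sketch using the old JobConf API (the class
name and job setup are illustrative, not from my actual job):

import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class HeapExample {
  public static void main(String[] args) throws Exception {
    // Raise the child task JVM heap for this job only, instead of
    // editing hadoop-site.xml on every node.
    JobConf conf = new JobConf(HeapExample.class);
    conf.set("mapred.child.java.opts", "-Xmx1024m");
    // ... mapper, reducer, and input/output paths would be set here ...
    JobClient.runJob(conf);
  }
}

Note that HADOOP_HEAPSIZE in conf/hadoop-env.sh (mentioned further down in
the thread) sizes the Hadoop daemon JVMs, not the child task JVMs, which is
why raising it alone didn't help here.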

Thanks,
Ryan


On Sun, Sep 21, 2008 at 12:59 AM, Ryan LeCompte <[EMAIL PROTECTED]> wrote:
> Yes I did, but that didn't solve my problem, since I'm working with a fairly
> large data set (8 GB).
>
> Thanks,
> Ryan
>
>
>
>
> On Sep 21, 2008, at 12:22 AM, Sandy <[EMAIL PROTECTED]> wrote:
>
>> Have you increased the heapsize in conf/hadoop-env.sh to 2000? This helped
>> me some, but eventually I had to upgrade to a system with more memory.
>>
>> -SM
>>
>>
>> On Sat, Sep 20, 2008 at 9:07 PM, Ryan LeCompte <[EMAIL PROTECTED]> wrote:
>>
>>> Hello all,
>>>
>>> I'm setting up a small 3 node hadoop cluster (1 node for
>>> namenode/jobtracker and the other two for datanode/tasktracker). The
>>> map tasks finish fine, but the reduce tasks are failing at about 30%
>>> with an out-of-memory error. My guess is that the amount of data I'm
>>> crunching through just won't fit in memory during the reduce tasks on
>>> two machines (max of 2 reduce tasks on each machine). Is this
>>> expected? If I had a larger hadoop cluster, I could increase the
>>> number of reduce tasks across the cluster so that the data isn't
>>> being processed in just 4 JVMs on two machines like I currently have
>>> set up, correct? Is there any way to get the reduce tasks to not hold
>>> all of the data in memory, or is my only option to add more nodes to
>>> the cluster and thereby increase the number of reduce tasks?
>>>
>>> Thanks!
>>>
>>> Ryan
>>>
>
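
Regarding the question in the quoted thread about increasing the number of
reduce tasks: a minimal sketch of the relevant knob, assuming the 0.18-era
JobConf API (the count shown is illustrative, not a recommendation):

import org.apache.hadoop.mapred.JobConf;

public class ReduceCountExample {
  public static void main(String[] args) {
    JobConf conf = new JobConf(ReduceCountExample.class);
    // More reduce tasks means each reducer handles a smaller slice of the
    // data; how many run concurrently per node is capped by the
    // tasktracker's mapred.tasktracker.reduce.tasks.maximum setting.
    conf.setNumReduceTasks(8); // illustrative value
  }
}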
