Why are you trying to run 100,000 map tasks?  Also, do you mean that you had
100 nodes, each with 2GB of RAM?  If so, those machines are much too small to
run 1,000 tasks apiece.  It is much better to run about as many concurrent
tasks per machine as you have cores (2-3 in your case).  You can still split
your input into 100,000 pieces; they will simply run in sequence.  For most
problems, though, it is better to let the system split your data so that each
split takes a few tens of seconds of work.  Very short tasks are inefficient
(startup overhead dominates the useful work), and very long tasks are
inconvenient (a single failure throws away a lot of work).
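As a rough sketch with the 0.16-era "mapred" API, that advice looks something
like the following.  The class name and paths are made up for illustration,
and remember that setNumMapTasks() is only a hint; the InputFormat decides
the actual split count.

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class SplitSizingDemo {
      public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(SplitSizingDemo.class);
        conf.setJobName("split-sizing-demo");
        conf.setInputPath(new Path("/user/demo/input"));    // hypothetical input
        conf.setOutputPath(new Path("/user/demo/output"));  // hypothetical output

        // A hint, not a mandate: ask for a few hundred splits sized so each
        // map does a few tens of seconds of work, rather than forcing
        // 100,000 tiny tasks through the JobTracker.
        conf.setNumMapTasks(200);

        JobClient.runJob(conf);
      }
    }

Per-node concurrency is a tasktracker setting rather than a per-job one; on a
dual-core box you would put something like
mapred.tasktracker.map.tasks.maximum = 2 in hadoop-site.xml.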

On Thu, Sep 25, 2008 at 11:26 PM, 심탁길 <[EMAIL PROTECTED]> wrote:

>
> Hi all
>
> Recently I ran a dummy job with 100,000 map tasks on a 100-node cluster
> (2GB RAM, dual-core, 64-bit machines, Hadoop 0.16.4).
>
> Each map task does nothing but sleep for one minute.
>
> I found that the JobTracker (1GB heap) was using about 650MB of heap when
> the job was 50% done.
>
> In the end, the job failed at 90% progress because the JobTracker hung,
> apparently due to running out of memory.
>
> How do you handle this kind of issue?
>
> Another related issue:
>
> While the above job was running, I clicked the "Pending" link on
> jobdetails.jsp in the web UI.
>
> The JobTracker then went to 100% CPU and stayed there for a couple of
> minutes.
>
>


-- 
ted
