Amar Kamat wrote:
Ted Dunning wrote:
Why do you try to do 100,000 map tasks? Also, do you mean that you had 100 nodes, each with 2GB? If so, that is much too small a machine to try to run 1000 tasks on. It is much better to run about the same number of tasks per machine as you have cores (2-3 in your case). Then you can easily split your input into 100,000 pieces which will run in sequence. For most problems, however, it is better to let the system split your data so that you get a few tens of seconds of work per split. It is inefficient to have very short tasks and it is inconvenient to have long-running tasks.
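To make that concrete, here is a minimal sketch assuming the 0.16-era org.apache.hadoop.mapred API; the class name, input/output paths, and the identity mapper/reducer are illustrative assumptions, not taken from this thread:

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.lib.IdentityMapper;
import org.apache.hadoop.mapred.lib.IdentityReducer;

public class FewerMapsExample {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(FewerMapsExample.class);
    conf.setJobName("fewer-maps");
    conf.setInputPath(new Path(args[0]));
    conf.setOutputPath(new Path(args[1]));
    conf.setMapperClass(IdentityMapper.class);
    conf.setReducerClass(IdentityReducer.class);

    // This is only a hint: the InputFormat decides the actual split
    // count, which is the behavior Ted recommends relying on so that
    // each split carries a few tens of seconds of work.
    conf.setNumMapTasks(200);

    JobClient.runJob(conf);
  }
}

Note that the per-node concurrency cap is not a job setting: each TaskTracker reads mapred.tasktracker.map.tasks.maximum from its hadoop-site.xml, and Ted's advice is to set it to roughly the number of cores (2-3 here).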
On Thu, Sep 25, 2008 at 11:26 PM, 심탁길 <[EMAIL PROTECTED]> wrote:
Hi all
Recently I tried a job with 100,000 dummy map tasks on a 100-node cluster (2 GB, dual-core, 64-bit machines, Hadoop 0.16.4)
I assume you are using hadoop-0.16.4. This issue got fixed in hadoop-0.17, where the JT was made a bit more efficient at handling a large number of fast-finishing maps. See HADOOP-2119 for more details.
I meant http://issues.apache.org/jira/browse/HADOOP-2119.
Amar
The map task does nothing but sleep for one minute.
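(A dummy map of that sort could look like the following minimal sketch, assuming the 0.16-era mapred API. The class name and key/value types are illustrative; a real dummy job would also need an input that delivers one record per task, or the sleep would repeat per record.)

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class SleepMapper extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, Text> {
  public void map(LongWritable key, Text value,
                  OutputCollector<Text, Text> output, Reporter reporter)
      throws IOException {
    try {
      // Do no work: just sleep for one minute and emit nothing.
      Thread.sleep(60 * 1000L);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    // Tell the framework the task is alive so it is not timed out.
    reporter.progress();
  }
}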
I found that the JobTracker (1GB heap) consumes about 650MB of heap memory when the job is 50% done.
In the end, the job failed at about 90% progress because the JobTracker hung, apparently due to running out of memory.
How do you handle this kind of issue?
Another related issue: while the above job was running, I clicked "Pending" on jobdetails.jsp in the web UI. The JobTracker then consumed 100% of the CPU, and it stayed at 100% for a couple of minutes.