[ https://issues.apache.org/jira/browse/HADOOP-4766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12658802#action_12658802 ]

amar_kamat edited comment on HADOOP-4766 at 12/23/08 1:14 AM:
--------------------------------------------------------------

I tried running 5 sleep jobs with 100,000 maps (1 sec wait per map) on 200 nodes 
back to back; the invocation is sketched after the table. Here are the runtimes:
||run-no||time||
|1|25min 58sec|
|2|26min 14sec|
|3|26min 19sec|
|4|25min 53sec|
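
For reference, a minimal sketch of the kind of invocation used, assuming the stock 
sleep-job driver shipped in the Hadoop test jar; the jar name and option values 
below are illustrative, not taken from the run above:
{noformat}
# Submit one sleep job: 100,000 maps, each sleeping ~1 second, trivial reduce side.
# Repeat the same command to run the jobs back to back.
bin/hadoop jar hadoop-*-test.jar sleep -m 100000 -r 1 -mt 1000 -rt 1
{noformat}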

Note that the total memory used after running 9 sleep jobs (100,000 maps with 1 
sec wait) back to back (a few were killed) was ~384MB. The cluster is configured 
to keep 0 jobs in memory (using the 
[patch|https://issues.apache.org/jira/secure/attachment/12395497/HADOOP-4766-v1.patch])
 and a 5 min tracker-expiry interval. This experiment was meant to show that 
every job eventually leaves the JobTracker's memory.
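
For completeness, a sketch of the kind of hadoop-site.xml settings this implies, 
assuming the standard 0.18/0.19 property names; the exact knob used to retire 
completed jobs from memory may instead be one introduced by the attached patch:
{noformat}
<!-- Keep 0 completed jobs per user in the JobTracker's memory. -->
<property>
  <name>mapred.jobtracker.completeuserjobs.maximum</name>
  <value>0</value>
</property>

<!-- Expire lost TaskTrackers after 5 minutes (value in milliseconds). -->
<property>
  <name>mapred.tasktracker.expiry.interval</name>
  <value>300000</value>
</property>
{noformat}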

> Hadoop performance degrades significantly as more and more jobs complete
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-4766
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4766
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.18.2, 0.19.0
>            Reporter: Runping Qi
>            Assignee: Amar Kamat
>            Priority: Blocker
>             Fix For: 0.18.3, 0.19.1, 0.20.0
>
>         Attachments: HADOOP-4766-v1.patch, map_scheduling_rate.txt
>
>
> When I ran the gridmix 2 benchmark load on a fresh cluster of 500 nodes with 
> Hadoop trunk, the load, consisting of 202 map/reduce jobs of various sizes, 
> completed in 32 minutes.
> Then I ran the same set of jobs on the same cluster; they completed in 43 
> minutes.
> When I ran them a third time, it took (almost) forever --- the JobTracker 
> became non-responsive.
> The JobTracker's heap size was set to 2GB.
> The cluster was configured to keep up to 500 jobs in memory.
> The JobTracker kept one CPU busy all the time; it looked like this was due to GC.
> I believe releases 0.18 and 0.19 have similar behavior.
