Can you dig in more, Hari?  When a child process won't go down, try
figuring out what it's doing.  Thread-dump it or study its logs?
St.Ack
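
A rough sketch of one way to collect those thread dumps, in case it helps.
It assumes the lingering task JVMs are listed by `jps -l` under the main
class org.apache.hadoop.mapred.Child, that the JDK tools `jps` and `jstack`
are on the PATH, and that it runs as the user who owns the task JVMs. The
function names and the output file names are made up for illustration.

```python
import subprocess

# Assumption: MapReduce task JVMs appear in `jps -l` under this main class.
CHILD_MAIN_CLASS = "org.apache.hadoop.mapred.Child"


def find_child_pids(jps_output, main_class=CHILD_MAIN_CLASS):
    """Return the PIDs in `jps -l` output whose main class matches."""
    pids = []
    for line in jps_output.splitlines():
        # Each useful line looks like "<pid> <fully.qualified.MainClass>".
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].isdigit() and parts[1].strip() == main_class:
            pids.append(int(parts[0]))
    return pids


def dump_threads(pid):
    """Run `jstack <pid>` and return whatever it printed to stdout."""
    result = subprocess.run(["jstack", str(pid)], capture_output=True, text=True)
    return result.stdout


def dump_all_children():
    """Write one thread dump file per lingering Child JVM; return the PIDs."""
    listing = subprocess.run(["jps", "-l"], capture_output=True, text=True).stdout
    pids = find_child_pids(listing)
    for pid in pids:
        # Hypothetical file naming, one file per process.
        with open("child-%d.jstack.txt" % pid, "w") as out:
            out.write(dump_threads(pid))
    return pids
```

Calling dump_all_children() on a node while no jobs are running should
leave one file per stuck Child. Comparing where the threads are parked
across a few of those dumps, alongside the task logs, may show what the
processes are waiting on.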

On Tue, Dec 7, 2010 at 4:36 AM, Hari Sreekumar <[email protected]> wrote:
> Hi,
>
>       My cluster was running great till yesterday. Today, I submitted some
> jobs and I saw that the jobs were taking way too long. On investigation, I
> saw that the "Child" processes created by previous MR jobs were not getting
> killed, even though no jobs were running on the cluster, and there were like
> 40-50 child processes, each consuming memory, leading to huge swapping. When
> I kill -9 the child processes and re-run the jobs, I don't encounter this
> problem for some time, and then again the child processes don't get killed
> and eventually swapping happens. What could be the reason/solution?
>
> Thanks,
> Hari
>
